Journal of Applied Psychology 


John G. Darley, Editor 
University of Minnesota 


Lorraine Bouthilet, Managing Editor 





Table of Contents 


The Influence of the Spatial Positioning of Stimulus and Response Components on Performance of a Repetitive Key-Pressing Task: N. H. Anderson, D. A. Grant, and C. O. Nystrom

An Investigation of the Shape of Learning Curves for Industrial Motor Tasks: J. G. Taylor and P. C. Smith

The Selection of Graduate Students in Public Health Education: R. P. Barthol and B. A. Kirk

A Comparison of Successful and Unsuccessful Students in the Medical School at the University of Minnesota: V. H. Hewer

The Development and Standardization of a Preliminary Form of an Activity Experience Inventory: A Measure of Manifest Interest: W. P. Ewens

Fakability of the Gordon Personal Profile: J. T. Rusmore

Evaluation of Angular Digits and Comparisons with a Conventional Set: P. J. Foley

Dimensional Analysis of Motion: IX. Comparison of Visual and Nonvisual Control of Component Movements: J. Huiskamp, R. C. Smader, and K. U. Smith

A New Technique for Rapid Item Analysis: C. A. Cuadra

A Methodological Note on Time Intervals Between Consecutive Accidents: A. Mintz

A Note on the “Fakability” of the Minnesota Teacher Attitude Inventory: A. G. Sorenson

A Note on Measuring “Understandability”: R. F. Lockman

GATB in Foreign Countries: B. J. Dvorak





American Psychological Association 


Volume 40, Number 3 June, 1956 





Consulting Editors 


Harold E. Burtt, Ohio State University 

Alphonse Chapanis, Johns Hopkins Univer- 
sity 

Clifford E. Jurgensen, Minneapolis Gas 
Company 

Laurence S. McGaughran, University of 
Houston 

Quinn McNemar, Stanford University 


Alexander Mintz, City College of New York 
Harold F. Rothe, Fairbanks, Morse and 
Company 
Julian B. Rotter, Ohio State University 
Thomas A. Ryan, Cornell University 
Donald E. Super, Columbia University 
Miles A. Tinker, University of Minnesota 
Alfred C. Welch, University of New Mexico 





This journal gives primary consideration to origi- 
nal investigations in any field of applied psychol- 
ogy except clinical and consulting psychology, al- 
though a descriptive or theoretical article may be 
accepted if it represents a special contribution in 
an applied field. Quantitative investigations of in- 
terest or value to psychologists working in the fol- 
lowing broad fields will be considered: vocational 
and educational prognosis, diagnosis, and guidance 
at the secondary and college level; personnel re- 
search in business, industry, and government; bio- 
mechanics; industrial working conditions; research 
on opinion and morale factors; job analysis and 
classification research; market and advertising re- 
search. 


Because of the large number of manuscripts sub- 
mitted, authors should adhere to the rule of 


“brevity consistent with clarity.” The typical 
manuscript should run to approximately 4,000 
words. There is a lag of approximately twelve 
months between receipt and publication of an 
article. Authors may request advanced publica- 
tion if they are prepared to pay the cost of print- 
ing the necessary extra pages. 


Manuscripts should be addressed to the Editor, 
John G. Darley, 408 Johnston Hall, University of 
Minnesota, Minneapolis 14, Minnesota. All manu- 
scripts should be submitted in duplicate. Original 
figures are prepared for publication; duplicate fig- 
ures may be photographic or pencil-drawn copies. 

Manuscripts must conform to the style require- 
ments described in the “Publication Manual of the 
American Psychological Association,” Psychol. Bull., 
1952, 49, No. 4, Part 2. 





Journal of Applied Psychology 


Published bimonthly by the 
American Psychological Association 
Prince and Lemon Sts., Lancaster, Pa. 
and 1333 Sixteenth Street N.W. 
Washington 6, D. C. 


$8.00 per volume $1.50 per issue 

Subscriptions, orders, and business communications should be addressed to the American Psychological Association, 
1333 Sixteenth St. N.W., Washington 6, D. C. Address changes must reach the subscription office by the 15th of 
the month to take effect the following month. Undelivered copies resulting from address changes will not be replaced; 
subscribers should notify the post office that they will guarantee second-class forwarding postage. Other claims for 
undelivered copies must be made within four months of publication. 


Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the act of March 3, 1879. 


Acceptance for mailing at the special rate of postage provided for in paragraph (d-2), Section 34.40, P. L. & R. 
of 1948, authorized October 10, 1947. 


Copyright © 1956 by the American Psychological Association, Inc. 





Journal of Applied Psychology 








Vol. 40, No. 3 


JUNE, 1956 








The Influence of the Spatial Positioning of Stimulus and 
Response Components on Performance of a Repetitive 
Key-Pressing Task ¹


Norman H. Anderson, David A. Grant, and Charles O. Nystrom ²


University of Wisconsin 


This study investigates the relative effi- 
ciencies of a number of spatial positionings 
of a stimulus panel and a response keyboard 
used in a repetitive key-pressing task. In 
designing display-control arrangements it is 
generally considered best to place the visual 
display directly in front of the operator’s 
eyes and to have the hand controls con- 
veniently centered in front of the operator 
and in direct stimulus-response correspond- 
ence with respect to the elements of the dis- 
play (3, 4, 5, 6, 7, 9). In complicated dis- 
play-control arrangements, as in an aircraft 
cockpit, the competition for the optimal space 
is critical, and it is frequently necessary to 
arrange display or control components in less 
than optimal positions and correspondence. 
It therefore becomes important to have some 
measure of the degradation of performance to 
be expected with suboptimal display-control 
arrangements. The present study compares 
eight suboptimal arrangements with the (pre- 
sumably) optimal arrangement of a visual 
display and a finger-operated control board. 

The general problem of the location of the work space has been
dealt with by time-and-motion study engineers and has been
adequately summarized by Chapanis, Garner, and Morgan
(2, pp. 331-364; 9). There have also been studies of location
discrimination (3), the compatibility or correspondence of
responses to stimulus displays (5, 6, 7), and the effect of
tilting key-pressing control panels (8). The present study
involves comparisons of various locations of the complete
visual display and the complete set of finger controls in
continuous and intermittent psychomotor performance. In the
present experiment nine arrangements of the stimulus panel and
response keyboard were used. The stimulus panel occupied the
right, left, and front positions relative to the operator, and
the response keyboard occupied similar positions,
independently. Time-and-error indices of operator efficiency
were investigated as they were affected by the spatial
positioning of these two components.


1 This research was supported in part by the USAF under
contract AF 18(600)-54 monitored by the Aero Medical
Laboratory of the Wright Air Development Center,
Wright-Patterson AFB, Ohio, and by the Research Committee of
the Graduate School of the University of Wisconsin.

2 Now at the University of Delaware.


Method


Apparatus. The Multiple Serial Discrimeter (MSD) used in this
experiment is the same as that employed in previous
experiments (6, 7), except for the substitution of a new
light-touch keyboard with piano-like keys for the
typewriter-like keys used in the earlier work. The MSD has
been adequately described in earlier reports so that only a
brief summary of its characteristics will be given here.

The stimulus display panel consists of a row of eight red
stimulus lights lying directly above a row of eight green
response information lights. Illumination of a subset of three
red lights constituted a stimulus pattern. The green lights
are activated
whenever the corresponding key on the response key- 
board is pressed. A stimulus programming unit 
based on two Western Union tape transmitters pro- 
duces the successive stimulus patterns from punched 
paper tapes. Under self-pacing (SP) the tapes are 
advanced .01 sec. after the operator has matched the 
current pattern. With automatic pacing (AP) the 
new pattern comes on after a preset interval. A 20-pen
Esterline-Angus Graphic Operations Recorder gives a
continuous record of stimuli and responses.

The operator sat in a chair 20 in. high, with a 
safety belt used to restrict his movements. Stimulus 
panel and response keyboard were mounted on small 
movable tables 30½ in. high; with each arrangement
the center of the response keyboard was 15 in. 
away from the center of the chair, and the stimulus 
panel was 25 in. distant from the center of the chair. 
The stimulus lights in the display were 40 in. above 
the floor. 

Design and procedure. The experimental design 
was a 9 × 9 Latin square with two replications. The 
nine treatments were the nine possible display-con- 
trol positionings. These are denoted by pairs of the 
letters, L, R, F (indicating the left, right, and front 
positions relative to the operator) with the first 
letter of a pair giving the position of the display 
panel, the second giving the position of the response 
keyboard. Within each treatment, operators matched 
two blocks of 25 patterns, one given under SP, the 
other under AP. The sequence of blocks was either 
AP, SP; SP, AP; AP, SP; ..., or SP, AP; AP, 
SP; SP, AP; .... When any operator received 
one of these sequences, his replicate received the 
other. In addition, an initial warm-up block was 
given under SP using the FF treatment. 
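
The design described above can be sketched in a few lines. The paper does not print the square that was actually used, so the cyclic construction below is only an assumption chosen for illustration, as are the names `cyclic_latin_square` and `pacing_sequence`. The sketch enumerates the nine display-keyboard pairs, builds one 9 × 9 Latin square in which every treatment occurs once per operator and once per serial position, and produces the two alternating AP/SP block orders assigned to an operator and to his replicate.

```python
from itertools import product

POSITIONS = ["L", "F", "R"]  # left, front, right relative to the operator

# Nine treatments: first letter = display panel, second = response keyboard.
TREATMENTS = [display + keyboard for display, keyboard in product(POSITIONS, repeat=2)]


def cyclic_latin_square(items):
    """Return an n x n square in which row i is `items` rotated left by i places.

    This is one convenient Latin square, not necessarily the one used in the study.
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def pacing_sequence(n_treatments, start_with_ap):
    """Alternate the block orders (AP, SP) and (SP, AP) across successive treatments."""
    orders = [("AP", "SP"), ("SP", "AP")]
    offset = 0 if start_with_ap else 1
    return [orders[(i + offset) % 2] for i in range(n_treatments)]


square = cyclic_latin_square(TREATMENTS)
for operator, row in enumerate(square, start=1):
    print(f"operator {operator}: {' '.join(row)}")

# An operator and his replicate receive complementary pacing orders.
print(pacing_sequence(9, start_with_ap=True)[:3])
print(pacing_sequence(9, start_with_ap=False)[:3])
```

Rows stand for operators and columns for serial positions, which matches the later analysis of individual differences (rows) and practice effects (columns).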

Operators were strapped in the chair and given 
tape-recorded instructions. At the beginning of each 
block of patterns, the operator sat with hands on 
knees observing a fixation point 6 ft. high on a wall 
6 ft. distant. In response to a cue signaling the first 
pattern, he proceeded to match this pattern. Under 
SP, he continued in action for the remainder of the 
block of patterns. Under AP, he resumed his initial 
posture until the cue signaled the next pattern. Two 
signal cues were available: watching the display ob- 
liquely, or listening for the distinct click emitted by 
the apparatus in advancing the new pattern. Op- 
erators generally preferred the latter cue. Time be- 
tween successive patterns for AP treatments was 
about 6 sec., which was practically always ample to 
allow both matching and resumption of initial pos- 
ture. This interval was occasionally but unsys- 
tematically changed by as much as 2 sec. in order to 
avoid conditioning to the time interval. The 1-min. 
interblock rest allowed sufficient time to change dis- 
play-control positioning when a new treatment was 
to be used. 

Subjects. The Ss or operators were 18 male stu- 
dents at the University of Wisconsin who had vol- 
unteered to serve as paid Ss at the Laboratory of 
Experimental Psychology. In order to reduce prac- 
tice effects, operators were selected unsystematically 
from those who had already served in an earlier ex- 
periment using the same apparatus in a nearly identi- 
cal task. Two additional conditions were imposed: 
(a) all operators were right-handed, and (b) all operators
had scored under 50 sec. on each of the last
four trials of matching a self-paced block of 25 3-light
patterns in the earlier experiment. In all cases 
there was a 7-day interval between the operator's 
performance in the two experiments. 


Table 1


Mean Scores per Pattern for Response Times, Latencies, and Errors: 
Each Mean the Average of 15 Responses from 18 Ss 


Location of          Location of Response Keyboard
Stimulus Panel       Left           Front    Right    Measure (error variance)

Left                 1.84           1.93     2.19     Response time,
Front                1.97           1.68     1.96     automatic pacing
Right                2.30           1.94     1.90     (.058)

Left                 1.10           1.12     1.23     Latency,
Front                1.24           0.98     1.16     automatic pacing
Right                1.31           1.15     1.16     (.050)

Left                 1.44           1.41     1.60     Response time,
Front                1.47           1.38     1.45     self-paced
Right                [illegible]    1.42     1.48     (.036)

Left                 [?].81         2.03     2.57     Errors per pattern,
Front                [?].79         1.91     1.93     automatic pacing
Right                [?].31         1.83     1.93     (1.145)

Left                 [illegible]    1.04     1.36     Errors per pattern,
Front                [illegible]    1.08     1.26     self-paced
Right                [illegible]    1.20     1.47     (0.642)










Results 


Only the last 15 stimulus patterns of each 
block of 25 were scored, the first 10 being 
used as warm-up. For both the SP and AP 
blocks, the response time (defined as time 
from onset of stimulus to the correct re- 
sponse) to the nearest .1 sec. and number of 
errors per 15-trial block were recorded. In 
addition, the latencies (from onset of stimu- 
lus to initial response) were measured for AP 
blocks. The 15 patterns were scored as a 
whole for the SP blocks. For AP blocks, the 
responses were read individually although the 
scores for the 15 patterns were used in the 
analysis. 

The numerical scores are summarized in 
Table 1. The five scores reported in Table 1 
consist of average response time, latency, and 
errors per pattern under AP, and average re- 
sponse time and errors per pattern under SP. 
The error variances for each score from the 
9 x 9 latin squares are given in parentheses 
in the right-hand column. 

Since angle between stimulus display and 
control keyboard was considered the most 
important single physical variable, the time 
scores have been plotted in Fig. 1 as a func- 
tion of the angle between display and con- 
trol units. The angle has nominal values of 
0°, 90° and 180°. Each display-control con- 
dition is designated by the letter pair beside 
the data points in Fig. 1. The error data 
presented in Table 1 exhibit the same trend as 
the time measures but with less uniformity. 
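
As a rough illustration of the grouping by angle, the sketch below maps each letter pair to its nominal angular separation and averages the automatic-pacing response times of Table 1 within each angle. The bearings of -90°, 0°, and 90° assigned to the left, front, and right positions are an assumption that simply reproduces the nominal 0°, 90°, and 180° separations; the function names are likewise illustrative.

```python
# Mean AP response times (sec. per pattern) from Table 1.
# Key: display position followed by keyboard position.
AP_RESPONSE_TIME = {
    "LL": 1.84, "LF": 1.93, "LR": 2.19,
    "FL": 1.97, "FF": 1.68, "FR": 1.96,
    "RL": 2.30, "RF": 1.94, "RR": 1.90,
}

# Assumed nominal bearing of each position from the operator, in degrees.
BEARING = {"L": -90, "F": 0, "R": 90}


def angular_separation(treatment):
    """Nominal angle between display and keyboard for a pair such as 'FL'."""
    display, keyboard = treatment
    return abs(BEARING[display] - BEARING[keyboard])


def mean_by_angle(scores):
    """Average the treatment means that share the same angular separation."""
    groups = {}
    for treatment, score in scores.items():
        groups.setdefault(angular_separation(treatment), []).append(score)
    return {angle: sum(values) / len(values) for angle, values in sorted(groups.items())}


for angle, mean in mean_by_angle(AP_RESPONSE_TIME).items():
    print(f"{angle:3d} deg: {mean:.3f} sec")
```

The averages rise with the angle, which is the trend plotted in Fig. 1; the spread inside each angle is what the separate display and keyboard effects account for.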

All scores were analyzed for treatment effects, practice
effects (columns) and individual differences (rows) for the
pair of latin squares. Treatment effects were then further
analyzed to evaluate effects of display placement, control
placement, and the display-control placement interaction.³
Finally the treatment means were ordered and subjected to the
first step of the Tukey gap test (10), using the .01
confidence level, in order to determine which specific
display-control arrangements were superior to others. Because
the error scores showed significant variation only between
operators, their further analysis will not be discussed.


3 Analysis of variance summary tables for these tests have
been deposited with the American Documentation Institute.
Order Document No. 4725 from ADI Publications Project,
Photoduplication Service, Library of Congress, Washington 25,
D. C., remitting in advance $1.25 for microfilm or $1.25 for
photocopies. Make checks payable to Chief, Photoduplication
Service, Library of Congress.


[Figure 1: two panels, automatically paced and self-paced;
horizontal axis, angular separation (0°, 90°, 180°); vertical
axis, mean seconds per pattern; separate symbols for latency
and response time. Key: F = front, L = left, R = right
position; FL = stimulus panel in front position, response
keyboard in left position.]

Fig. 1. Variation of mean time per response pattern with
angular separation of stimulus panel and response keyboard:
response time and latency for automatic pacing; response time
for self-pacing.


AP Procedure 


Response time. The response time in- 
creases in Fig. 1 with increasing angle be- 
tween display panel and control keyboard. 
Within a given angle, however, the various 
procedures are reasonably homogeneous; for 
example, the RF, LF, FR, and FL averages 
at 90° lie within a one-tenth sec. interval. 
Effects of display panel positions, response 
keyboard position and their interaction are 
all significant (p < .001). Significant indi- 
vidual differences (p < .001) and learning ef- 
fects (p< .001) were also obtained. The 
Tukey gap test gave the following separation 
of means: 


FF < LL < RR < (LF, RF) 
< (FR, FL) < LR < RL. 







Here, as below, a < sign indicates that the 
treatment on the left had a significantly lower 
mean than the treatment on the right. Pa- 
rentheses enclose treatments that are not sta- 
tistically separable from one another. The 
gap test shows that within a single angular 
separation of display and control elements it 
is more important to center the control key- 
board than to have the stimulus display in 
the optimal position. 
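
The reported separation can be checked against the automatic-pacing response times of Table 1. The sketch below does not carry out the Tukey gap test itself, since the critical gap is not given in the text; it only confirms that the rank order of the tabled means agrees with the printed sequence. `REPORTED_GROUPS` and the two helper functions are names introduced here for illustration.

```python
# Mean AP response times (sec. per pattern) from Table 1.
AP_RESPONSE_TIME = {
    "LL": 1.84, "LF": 1.93, "LR": 2.19,
    "FL": 1.97, "FF": 1.68, "FR": 1.96,
    "RL": 2.30, "RF": 1.94, "RR": 1.90,
}

# Sequence printed in the text; a list with two entries corresponds to a
# parenthesized pair of treatments that were not statistically separable.
REPORTED_GROUPS = [["FF"], ["LL"], ["RR"], ["LF", "RF"], ["FR", "FL"], ["LR"], ["RL"]]


def rank_order(scores):
    """Treatments sorted from lowest to highest mean."""
    return sorted(scores, key=scores.get)


def consistent_with_means(groups, scores):
    """True if every mean in a group lies below every mean in the next group."""
    for lower, upper in zip(groups, groups[1:]):
        if max(scores[t] for t in lower) >= min(scores[t] for t in upper):
            return False
    return True


print(rank_order(AP_RESPONSE_TIME))
print(consistent_with_means(REPORTED_GROUPS, AP_RESPONSE_TIME))
```

Within the 90° group the two arrangements with the keyboard in front (LF, RF) rank ahead of the two with the display in front (FR, FL), which is the basis for the statement about centering the keyboard.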

Latency. Although the same trends appear 
in the time to initial response or latency data 
as in the time to correct response data, the 
magnitudes of the differences are smaller. 
Aside from individual differences (p < .001), 
only keyboard position affected the latency 
score significantly (p < .05). The gap test 
gave the following separation of treatments: 


FF < LL < LF < (RF, RR, FR) 
< (FL, LR) < RL. 


SP Procedure 


The response time measures under the SP 
procedure show essentially the same features 
as response times under AP procedures. The 
range of variation is, however, considerably 
smaller. Aside from significant individual differences
(p < .001) and the practice effect (p < .05), only the
position of the response keyboard had a significant effect
(p < .05). The gap test gave the following separation of
treatments: 


FF < (LF, RF) < (LL, FR) 
< (FL, RR) < RL < LR. 


Here some of the centered arrangements were 
not superior to the 90° separations in con- 
trast to the corresponding AP results. 


Discussion 


The results of the experiment show that 
the angular separation of the display and 
control units and the absolute position of the 
display and control units significantly affect 
the time scores under both the AP and the 
SP pattern-matching procedures. The abso- 
lute position of the response keyboard was 
more important than the absolute position of 
the stimulus display. The AP scores were 
about three times as sensitive as the SP scores 
to the angular separation of display and control
units. The percentage loss from best to 
worst spatial separation for SP was 10% to 
15% and for AP 30% to 40%. The greater 
difference in AP performance was presumably 
caused by the fact that the operator had first 
to respond to the cue and then to the stimu- 
lus pattern proper. In doing so he had to 
execute certain gross bodily movements. With 
SP these postural readjustments were not re- 
quired, and the cue and stimulus were identi- 
cal. The greater sensitivity of the AP pro- 
cedure to time degradation is of special sig- 
nificance because it approximates more closely 
the operating conditions in many practical 
monitoring situations. 

The AP response times of Fig. 1 show a 
degradation of .6 sec. in going from most to 
least favorable treatment. Comparing these 
data with the latencies, it is seen that half 
this loss is due to increased latency, with the 
other half arising in the manipulative process 
itself. It should be noticed that this was not 
accompanied by a significant change in num- 
ber of errors. 
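
The arithmetic behind these figures can be reproduced from Table 1. The sketch below is illustrative only: the response times are the tabled automatic-pacing means for FF, LR, and RL, while the RL latency of 1.31 sec. is an assumed reading of a table cell that is damaged in this copy. The variable and function names are introduced here for the example.

```python
def percent_increase(best, worst):
    """Percentage by which `worst` exceeds `best`."""
    return 100.0 * (worst - best) / best


# AP response times, sec. per pattern (Table 1).
RT_FF, RT_LR, RT_RL = 1.68, 2.19, 2.30

# AP latencies, sec. per pattern. LAT_RL is an assumed reading of a damaged cell.
LAT_FF, LAT_RL = 0.98, 1.31

rt_loss = RT_RL - RT_FF            # degradation from most to least favorable treatment
latency_loss = LAT_RL - LAT_FF     # part of that loss already present at the first response
latency_share = latency_loss / rt_loss

print(f"AP loss, FF to LR: {percent_increase(RT_FF, RT_LR):.1f}%")
print(f"AP loss, FF to RL: {percent_increase(RT_FF, RT_RL):.1f}%")
print(f"response-time degradation: {rt_loss:.2f} sec")
print(f"share attributable to latency: {latency_share:.2f}")
```

Under these readings the two 180° arrangements fall inside the 30% to 40% range quoted for automatic pacing, and roughly half of the response-time loss is already present in the latency.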

When the above results are compared with 
the results of earlier studies in this series (6, 
7) and those of Fitts and Seeger (5), it is 
seen that a less preferred or suboptimal spa- 
tial positioning of display and control key- 
board produces a much smaller degradation in 
reaction time and response accuracy than results
from interference with the natural correspondence
between the stimulus and response components.
Increases of response time from 10% to 40% were
found in this experiment. Linear transposition of stimulus 
and response elements and disturbance of 
the angular correspondence between stimulus 
lights and response keys were found to give 
time increases as great as 1500% and 100%, 
respectively. Hence, if departures from opti- 
mal display-control relations are required, it 
is probably best to move the display unit or 
the keyboard laterally as units rather than to 
rearrange the individual keys or response ele- 
ments. 

Summary 

Results are reported of an experiment investigating
operator efficiency in a key-pressing task as a
function of spatial positioning of the stimulus
panel and response keyboard. 
The stimulus panel and the response key- 
board occupied positions that were to the left, 
right, or in front of the operator. The nine 
possible combinations of positions of stimu- 
lus display and response keyboard were used 
as treatments, using a balanced experimental 
design on 18 Ss. Two modes of stimulus 
presentation were employed within each treat- 
ment: under self-pacing, S kept his fingers on 
the response keyboard, matching the stimulus 
patterns which succeeded one another as fast 
as they were matched; under automatic pac- 
ing, S returned to a rest position between 
matching successive patterns which were pre- 
sented approximately 6 sec. apart. 

Five sets of scores were taken. Response 
time and number of key presses (an error 
index) were measured in both automatic pac- 
ing and self-pacing. In addition, latencies 
were measured in the automatic-paced pro- 
cedure. 

The following results were obtained: 

1. With the self-paced procedure, response 
times were 10% to 15% greater when the 
stimulus and response units were on opposite 
sides of the S than for the optimal arrange- 
ment where both units were in front of S. 
The corresponding increase for automatic 
pacing was 30% to 40%. 

2. For automatic pacing, half of the de- 
crease in efficiency arose in the manipulatory 
process at the keyboard. The other half was 
associated with the additional movements 
necessary in the less efficient treatments. 

3. No significant differences in errors were 
observed among the various treatments. 

4. Position of the response keyboard ex- 
erted a significant effect on all three time 
measures, the centered position being pre- 
ferred, and the left position giving poorest 
results. For automatic pacing, the position 
of the stimulus panel and its interaction with 
the response keyboard were also significant 
factors, the front position being best and the 
right position poorest. 

5. For each time measure, the different 
treatments showed considerable separation 
when tested with the Tukey gap test. Gen- 
erally speaking, placement of response key- 
board was more important than location of 
the display. 

The present results are contrasted with increases
in response time as great as 1500% 
obtained in the previous experiments of this 
series where the effects of interfering with 
natural angular and linear correspondences of 
individual stimulus and response elements 
were investigated. 


Received June 14, 1955. 


References 


1. Barnes, R. M. Motion and time study. (3rd Ed.) New York:
Wiley, 1949.

2. Chapanis, A., Garner, W. R., & Morgan, C. T. Applied
experimental psychology. New York: Wiley, 1949.

3. Fitts, P. M. A study of location discrimination ability.
In P. M. Fitts (Ed.), Psychological research on equipment
design. Washington: U. S. Government Printing Office, 1947.
Pp. 207-217. (AAF Aviat. Psychol. Program Res. Rep. No. 19.)

4. Fitts, P. M. Engineering psychology and equipment design.
In S. S. Stevens (Ed.), Handbook of experimental psychology.
New York: Wiley, 1951. Pp. 1287-1340.

5. Fitts, P. M., & Seeger, C. M. S-R compatibility: spatial
characteristics of stimulus and response codes. J. exp.
Psychol., 1953, 46, 199-210.

6. Morin, R. E., & Grant, D. A. Learning and performance on a
key-pressing task as a function of the degree of
stimulus-response correspondence. J. exp. Psychol., 1955, 49,
39-47.

7. Nystrom, C. O., & Grant, D. A. Performance on a
key-pressing task as a function of the angular correspondence
between stimulus and response elements. Percept. Mot. Skills,
1955, 1, in press.

8. Scales, Edyth M., & Chapanis, A. The effect on performance
of tilting the toll-operator’s keyset. J. appl. Psychol.,
1954, 38, 452-456.

9. Stellar, E. Human factors in panel design. In Human
factors in undersea warfare. Washington: National Research
Council, 1949. Pp. 153-176.

10. Tukey, J. W. Comparing individual means in the analysis
of variance. Biometrics, 1949, 5, 99-114.





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


An Investigation of the Shape of Learning Curves for 
Industrial Motor Tasks ¹


Jean Grove Taylor 


Johns Hopkins University Operations Research Office 


and Patricia Cain Smith 


Cornell University 


Learning curves are used in industry for a 
variety of purposes, including monetary incen- 
tives and evaluation of the learners’ progress. 
The standard or model curve is generally one 
which has been either drawn “intuitively” or 
copied from published curves, the validity of 
which has been assumed rather than empiri- 
cally established for the tasks involved. The 
purpose of this study is to determine whether 
there is a “typical” learning curve for tasks 
differing greatly in degree of complexity and 
learning time, but performed under similar in- 
centive conditions. 

Short periods of learning and temporary 
conditions of motivation have been charac- 
teristic of laboratory studies (2, 6, 10, 12, 
15). A satisfactorily representative curve 
should be based on a period of learning suffi- 
ciently long to ensure that the learning curve 
has reached a plateau. Motivation to learn 
should not be reduced at any point by restric- 
tive codes or “loose” standards which might 
introduce motivational plateaus, and group 
rather than individual data should be used 
for the sake of reliability. Field investiga- 
tions have been reported for motor tasks such 
as typewriting (4), telegraphy (5), hosiery 
looping (16), and textile-machine operations 
(3). Although a few of these studies are 
based on group data, none has taken into ac- 
count individual differences in total time to 
reach some criterion of learning. A typical error in
analysis has been to average the production figures
attained at the end of some period, e.g., the fifth week,
for a group of workers even though the time required to
learn to a criterion varied for individual workers from ten
to twenty weeks. This procedure distorts the shape of the
resultant curve since the rates of learning are very
different for the different learners at different
percentages of total learning time. 


1 This analysis was carried out as part of a master’s thesis
by Jean Taylor, and was performed at Cornell under the
direction of Patricia Smith. The authors wish to express
their gratitude to Kurt Salmon Associates, and to their
client who prefers to remain anonymous, for their cooperation
in obtaining the records upon which the study was based. 
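
The distortion produced by averaging at a fixed week can be seen in a small numerical sketch. Everything in it is hypothetical: the curve `productivity` is an invented function of the fraction of learning time completed, and the two learners are assumed to need 10 and 20 weeks to reach the criterion, the extremes mentioned in the text. Both learners follow the same curve once time is expressed as a percentage of their own learning period, yet their fifth-week average describes neither of them.

```python
def productivity(fraction):
    """Hypothetical per cent of standard after a given fraction of learning time."""
    fraction = min(max(fraction, 0.0), 1.0)
    return 100.0 * (1.0 - (1.0 - fraction) ** 2)


# Hypothetical learners: weeks each one needs to reach the learning criterion.
WEEKS_TO_CRITERION = [10, 20]


def mean_at_week(week):
    """Average production at a fixed calendar week, ignoring differences in learning time."""
    values = [productivity(week / total) for total in WEEKS_TO_CRITERION]
    return sum(values) / len(values)


def mean_at_fraction(fraction):
    """Average production at the same percentage of each learner's own learning time."""
    values = [productivity(fraction) for _ in WEEKS_TO_CRITERION]
    return sum(values) / len(values)


print(mean_at_week(5))        # mixes a learner halfway through with one a quarter through
print(mean_at_fraction(0.5))  # both learners compared at 50 per cent of learning time
```

At week 5 the faster learner is halfway through learning and the slower one a quarter of the way, so the fixed-week mean blends two different stages of the same curve.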

In this investigation, learning curves were 
compared for relatively simple tasks involving 
a fixed motor sequence with those for tasks 
involving increasing degrees of complexity 
and requiring continuous and varied adjust- 
ments. 

The Material 


The material for this investigation was ob- 
tained from a non-unionized factory in the 
South engaged chiefly in the production of 
boys’ and men’s dungarees and overalls. All 
of the standards had been set during 1940 and 
1941 by a firm of consulting engineers and re- 
mained constant throughout the period of 
this study. The piece-rate system was sup- 
plemented by a guaranteed minimum wage, 
and in most cases by a learners’ bonus plan.²
Payment during the learning period proceeded 
stepwise from a wage in the first week equal 
to about 60 per cent of the final base rate to 
a wage of 100 per cent of the base rate in the 
last week (at 100 per cent productivity, or 
standard). The learner was furnished with 
a learners’ curve as a guide to the percentage 
of standard that should be produced on the 
specific job each week in order to obtain 
learners’ bonus wages. These curves were 
drawn by the consulting engineers without detailed analysis
of the obtained curves in the plant. The length of the
training period on several key jobs was determined by the
average time to reach standard. The shape of the curve was,
with but one exception, the same for all jobs. 


2 Thirteen of the 70 curves used in the study were obtained
from operators trained after the 75-cent minimum wage law
went into effect, which change was followed by temporary
abandoning of the learners’ plan. Inspection of these curves
showed no systematic differences from those of workers
trained previous to this change. 


Special “personnel supervisors” who knew the operations
thoroughly had the responsibility for on-the-job training of
new workers. One group of jobs was specifically assigned to
each supervisor. The trainers were about equal in experience,
had been given approximately equal and similar instruction in
training methods, and were closely supervised by a general
supervisor. Motivational and learning conditions were thus
fairly comparable from task to task. 

Learning curves were obtained for 189 operators trained
between the years of 1945 and 1949. Curves were eliminated
for operators who had left the plant before the end of the
official learning period set for the job or soon thereafter,
and for jobs which had fewer than four trainees, leaving 70
usable curves. 


Table 1 


Number of Curves, Range of Weeks and Range of Percentage Productivity, Median Number of Weeks and 
Median Percentage Productivity at the First Week of Initial Plateau 

Column headings: Job; No. Curves; Figures at First Week of Initial Plateau
(Range of Weeks to First Plateau; Median Weeks; Range % Production; Median Production)

Main Study 
  Tacking (line shaft) 
  Face front pockets 
  End finish (or finish band ends) 
  Attach flys 
  Fell in and out seams 
  Attach back pocket 

Check Study 
  Hem suspenders 
  Tacking (individual motor) 
  Bander 
  Bottom hem 
  Hem watch pocket and attach label 
  Serge front pocket 

Legible entries, in the order in which they appear: 
  11-26   8   123 
  8-27   75-120 
  7-19   110 
  9-2?   120 
  10-26   8-120 
  15-23   81-97 


Procedure and Results 


Modified Vincent curves were constructed 
for each of six jobs, which varied widely in 
difficulty and in the extent to which the tasks 
required continuous adjustment by the worker, 
and for the composite of the six jobs. The 
jobs chosen are shown in Table 1. Each of 
the six job classifications for machine opera- 
tions is included, representing six levels of 
difficulty. The results for these jobs were 
checked for six remaining jobs, similarly differing
in difficulty.³ The procedure for this 
analysis was as follows: 

Criterion of learning. The determination 
of the point at which learning terminates poses 
several problems. As Fig. 1 illustrates, the 
curves may continue to rise for a long time. 
There are, in fact, many cases in which in- 
creases continue to take place for as long as 
two years. In those cases where data are 
available, increases are apparent for three or 
four years. Verbal reports of operators on 


8 Space limits the presentation of figures to only 


three of the original thirteen. The original thesis is 
filed in the Cornell University Library. 





Jean Grove Taylor and Patricia Cain Smith 





[Fig. 1 plot. Panel labels: Hem Watch Pocket and Attach Label; End Finish. Vertical axis: % productivity. Horizontal axis: weeks of production.]


similar sewing jobs indicate that operators 
believe that they continue to learn through- 
out this period, both in terms of minor im- 
provements of method and improved “feel” 
for the task. After the initial plateau which
appears after the first sharp rise of the learning
curve, further plateaus may occur repeatedly
throughout an individual curve, and
quite characteristically do so, although they 
may appear for some individuals on a par- 
ticular job and not for others. For the sake 
of consistency, therefore, we chose the be- 
ginning of the first plateau, the period of 
initial leveling, as our criterion of 100 per 
cent learning time. The first of each pair of 
X’s in Fig. 1 represents that point. 

Each of two judges independently chose this 
criterion point for 43 learning curves, indicat- 
ing the first week followed by at least seven 
weeks in which production did not increase 
appreciably. No arbitrary limit was set on 
the amount of increase; instead, the total 
picture of the curve was used. The results 
showed, however, that the maximum increase 


Fig. 1. Individual learning curves for two jobs, showing period of initial plateau.


over the initial point was 3 per cent for the 
level period. The second X of each pair in 
Fig. 1 represents the termination of this pe- 
riod. The two judges agreed within one week 
concerning the first week of the plateau in 75 
per cent of the cases, with an average of only 
two weeks’ difference in all cases. After dis- 
cussion, the period of the initial plateau for 
each of the learning curves was finally deter- 
mined by joint agreement of the two judges. 
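
The criterion described above can be sketched in code. This is only an illustration: the judges worked from the total picture of each curve and set no arbitrary limit, so the 3 per cent ceiling used here is the maximum increase they reported afterward, treated as a relative rise over the candidate week. The function name `first_plateau_week` and the sample data are hypothetical.

```python
from typing import Optional, Sequence


def first_plateau_week(
    production: Sequence[float],
    level_weeks: int = 7,
    max_rise_pct: float = 3.0,
) -> Optional[int]:
    """Return the 1-based week that starts the first plateau, or None.

    A week qualifies when none of the next `level_weeks` weeks exceeds it
    by more than `max_rise_pct` per cent (an assumed, mechanical rule).
    """
    for start in range(len(production) - level_weeks):
        base = production[start]
        following = production[start + 1 : start + 1 + level_weeks]
        if base > 0 and max(following) <= base * (1 + max_rise_pct / 100):
            return start + 1
    return None


# Hypothetical weekly production, in per cent of engineering standard.
weekly = [30, 45, 58, 66, 72, 80, 81, 80, 82, 81, 80, 82, 81, 90]
plateau_start = first_plateau_week(weekly)
```

A mechanical rule of this kind would only approximate the judges' choices, which were settled by discussion where they disagreed.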

Plotting of individual curves. The rate of 
learning and the percentage productivity at 
the end of initial learning vary from indi- 
vidual to individual. The curves were first 
converted to a common scale on one axis on 
the basis of learning time. For each indi- 
vidual on a given job, percentage produc- 
tivity (in terms of the standard production 
set by engineers) was plotted against the per- 
centage of that individual’s total learning time 
which had been completed at a given week. 
For example, if in Week 7 the individual had 
a production record of 70 per cent of stand- 
ard production and his total learning time 





Shape of Learning Curves for Industrial Motor Tasks 





[Fig. 2 plot: composite curves equated for differential % productivity. Curves shown for Tacking, Finish Band Ends, Attach Flys, Fell In and Out Seams, Attach Back Pocket, and Face Front Pocket. Vertical axis: % of attained proficiency.]


was 21 weeks, the value was plotted at 70
per cent on the vertical axis and 33 1/3 per cent
on the horizontal axis. These charts are 
plotted in terms of percentage of engineering 
standard and percentage of learning time (to 
first plateau). 
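
The prorating of the time axis can be illustrated with a short sketch. The function name `prorate_curve` and the weekly record are hypothetical; only the Week 7 value of 70 per cent and the 21-week learning time come from the example in the text.

```python
def prorate_curve(weekly_productivity, weeks_to_plateau):
    """Return (pct_learning_time, pct_productivity) pairs up to the plateau."""
    points = []
    learning_weeks = weekly_productivity[:weeks_to_plateau]
    for week, productivity in enumerate(learning_weeks, start=1):
        # Week number expressed as a percentage of this learner's own learning time.
        pct_time = 100.0 * week / weeks_to_plateau
        points.append((pct_time, productivity))
    return points


# Hypothetical learner: 21 weeks to first plateau, 70 per cent of standard in Week 7.
record = [20, 32, 41, 50, 58, 64, 70, 74, 78, 81, 84,
          86, 88, 90, 92, 93, 95, 96, 97, 98, 100]
curve = prorate_curve(record, weeks_to_plateau=21)
pct_time, pct_productivity = curve[6]  # Week 7
```

Here Week 7 falls at one third of the learning time, so the point is plotted at 33 1/3 per cent on the horizontal axis and 70 per cent on the vertical axis.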

The curves for any one job show no sys- 
tematic differences in shape. As shown in 
Table 1, number of weeks required to reach 
the first week of the initial plateau ranged 
from 7 to 27 weeks for individual learners on 
the six jobs. Likewise, the level of produc- 
tion reached at the first week of the initial 
plateau ranged from 75 per cent to 123 per 
cent productivity. Although there was great 





Fig. 2. Composite prorated curves of main study. [Horizontal axis: % learning time.]


deviation with respect to these two variables, 
it was evident from inspection of the curves 
that the shapes of the learning curves for in- 
dividuals on the same job were relatively the 
same, thus justifying the use of a median 
curve to represent “typical” progress for that
task.

Construction of composite curves for each 
job. The composite curve for each job was 
constructed from the individual curves by 
computing, at each of the 10 per cent divi- 
sions of learning time, the median of the 
percentage productivity figures from the indi- 
vidual curves. These median production fig- 
ures during the period of the initial plateau 







Table 2

Ranges of the Composite Curves at Termination of Each Tenth of Learning Time for the Main and Check Studies

                  Ranges of % of Proficiency at Criterion
% of Learning
Time              Main Study      Check Study

10                33-51           38-44
20                52-64           53-65
30                62-72           62-72
40                67-80           66-75
50                70-85           69-79
60                74-86           75-84
70                79-89           77-87
80                82-92           80-90
90                88-95           90-95

* Percentage of learning time where range of check study is not within range of main study.


were found to vary from 86 per cent to 110 
per cent on the different jobs (see Table 1). 
To make these composite curves comparable, 
they were again prorated, this time on the 
other axis, in units representing percentages 
of the median production during the initial 
plateau, as shown in Fig. 2. The percentage 
productivity of each curve at 100 per cent 
learning time (beginning of first plateau) was 
designated as 100 per cent proficiency for 
that job, and each curve was redrawn in 
terms of percentages of that figure. 
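
The two steps just described, taking medians at each tenth of learning time and then re-expressing the result as a percentage of the value at 100 per cent learning time, might be sketched as follows. Linear interpolation between plotted points and clamping at the ends of a curve are assumptions made for this sketch; the names `interpolate` and `composite_curve` and the three sample curves are hypothetical.

```python
from statistics import median


def interpolate(points, x):
    """Linear interpolation on (x, y) points sorted by x; clamps at both ends."""
    if x <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]


def composite_curve(individual_curves, divisions=range(10, 101, 10)):
    """Median productivity at each division, rescaled so the 100% point is 100."""
    medians = {
        d: median(interpolate(curve, d) for curve in individual_curves)
        for d in divisions
    }
    criterion = medians[100]  # median production at the first week of the plateau
    return {d: 100.0 * m / criterion for d, m in medians.items()}


# Hypothetical prorated curves: (% learning time, % of engineering standard).
curves = [
    [(25, 40), (50, 60), (75, 80), (100, 100)],
    [(20, 50), (40, 70), (60, 80), (80, 85), (100, 90)],
    [(50, 60), (100, 120)],
]
composite = composite_curve(curves)
```

By construction every composite reaches 100 per cent of attained proficiency at 100 per cent of learning time, which is what makes jobs with different plateau levels comparable.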

The most striking feature of these com- 
posite job curves is their similarity. For all 
jobs, there is a noticeably high percentage of 
proficiency attained at the end of 20 per cent 
of learning time, with fairly regular increases 
thereafter. From 90 per cent to 100 per cent 
of learning time there is, in all cases, a sharp 
increase in percentage of attained proficiency. 
This is due in part to the fact that the median 
production figure at this 100 per cent point 
was not based on the prorated part of the 
curves, which usually involved some smooth- 
ing, since adjacent points were connected. 
The end of the curve was determined instead 
by actual production figures of the first week 
of the initial plateau which had been plotted 
for the individual curves. The terminal rise 
may also be due to the fact that the criterion 
of learning was defined as the first week of 
the period of initial leveling, a high point of 


the curve which was followed by no appreci- 
able increase in production. In many cases 
this was a point where considerable increase 
in production over the preceding point had 
been recorded. Although choice of this point 
distorts the shape of the curve at the end, the 
shape of the rest of the curve should not have 
been distorted by this method. 

The small differences between curves were 
compared to determine whether they were re- 
lated to task characteristics such as com- 
plexity and amount of adjustment required 
of the operator to perform a motor pattern. 
Tacking, which involves feeding a semi-auto- 
matic machine, represents a comparatively 
simple task involving a fixed motor sequence. 
A more complex motor pattern and a much 
greater amount of adjustment are required to 
perform Felling In and Out Seams. While 
the operator is felling the seams she must 
constantly feed, hold back, and readjust the 
material in her lap and arms. The remain- 
ing jobs range from the simpler to the more 
complex in both motor pattern and degree of 
adjustment required. The median job curves 
presented in Fig. 2 overlap each other and 
follow one general shape, instead of maintain- 
ing a fixed position relative to each other. In 
view of this, it would seem that the curves 
could not be differentiated on the basis of
complexity of job, or amount of adjustment 
required. 

Check study. An additional study was un- 
dertaken to determine whether these results 
would stand up under the analysis of six more 
jobs. Twenty-seven curves on six additional 
jobs, comparable in all respects to those of 
the main study, were analyzed in the same 
manner. (See the last half of Table 1.) 
Ranges of the percentages of the median 
production during the initial plateau which 
were attained for each of the composite curves 
were computed at the point terminating each 
tenth of the learning period. These ranges 
for this check study differed from the ranges 
of the main study by 2 per cent or less at any 
10 per cent point. (See Table 2.) The gen- 
eral shapes of learning curves in the check 
study were also in agreement with those of 
the main study.4


4 See footnote 3.
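
The agreement between the two sets of ranges can be checked arithmetically. The sketch below assumes that the ranges beginning 33-51 belong to the main study and those beginning 38-44 to the check study; this assignment of the two blocks of figures in Table 2 is an assumption, chosen because it is the one consistent with the statement that the ranges differ by 2 per cent or less. The variable names are hypothetical.

```python
# Ranges of % of proficiency at 10, 20, ..., 90 per cent of learning time (Table 2).
main_study = [(33, 51), (52, 64), (62, 72), (67, 80), (70, 85),
              (74, 86), (79, 89), (82, 92), (88, 95)]
check_study = [(38, 44), (53, 65), (62, 72), (66, 75), (69, 79),
               (75, 84), (77, 87), (80, 90), (90, 95)]


def excess_outside(main_range, check_range):
    """How far the check range extends beyond the main range (0 if contained)."""
    main_low, main_high = main_range
    check_low, check_high = check_range
    return max(main_low - check_low, check_high - main_high, 0)


excesses = [excess_outside(m, c) for m, c in zip(main_study, check_study)]
largest_excess = max(excesses)
```

Under that assumption the check-study range never extends more than 2 percentage points beyond the main-study range at any tenth of learning time.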










[Fig. 3 plot: median learning curves for the Check Study and the Main Study. Horizontal axis: % learning time.]


Construction of composite curves for all 
jobs. The “typical” curve for the combina- 
tion of all six of the jobs in the main study 
was derived by taking the median percentage 
of attained proficiency for each 10 per cent 
of learning time of the six composite curves. 
This composite is shown in the solid line of 
Fig. 3. Similarly, the composite was con- 
structed for the check study, and appears as 
the broken line in the figure. Inspection of 
the two curves demonstrates their essential 
similarity. Both curves are negatively ac- 
celerated, but with a large portion of the curve 
showing a nearly linear rise, extending from 


Fig. 3. Composite curves for all jobs, main and check studies.


20 to 80 per cent of the total learning time. 
These curves were compared with the curves 
used as guides by the learners; they are 
clearly different in shape. 


Discussion 

There is no evidence in the present study 
to suggest that differences in the complexity 
of the task, or in the degree of adjustment 
required, influence the general shape of the 
learning curve within the group of tasks 
studied here. There are, however, two char- 
acteristics of the obtained curves which re- 
quire explanation: the nearly rectilinear rise 







in the period immediately following the sharp 
initial increase, and the continuation of the 
increases for long periods of time after the 
first plateau. The rectilinear portion of the 
curve does not appear in the average curves 
obtained in previous field studies, where a 
different principle of averaging was used, al- 
though inspection of individual data sug- 
gests that it is there in some of the more 
complex tasks previously investigated (11). 
Laboratory data do not usually include this 
portion of the curve except for quite simple 
tasks. 

We propose that both the rectilinear phase 
and the continued rise may be explained in 
terms of shifts in the abilities required to 
learn and to perform the parts of a complex 
task as it is learned. Correlations between 
performance before and after learning may 
be quite low (3, 9, 11, 14). Shifts in the 


factorial composition of complex tasks have 
been shown by Fleishman and Hempel, with 
decreases in loadings of verbal comprehension 
(8), perceptual speed (7), mechanical experi- 
ence (7), and spatial relations (7, 8), and 
increases in loadings for reaction time (8), 
motor speed (7, 8), and factors specific to 


the task (7, 8), as learning progressed. Un- 
published work by one of the authors shows 
a sharp decrease in validity of visual tests for 
prediction of criteria after the learning pe- 
riod, rather than during learning, for power 
sewing-machine operators. 

On the basis of such evidence, combined 
with observation of and discussion with work- 
ers at various stages of learning on such jobs, 
we suggest that the obtained learning curves 
include three sections, in each of which dif- 
ferent abilities are of primary importance, 
and different processes are taking place. 

1. The initial sharp rise which represents 
the first 20 per cent of the learning time is 
one in which the relationships of the parts of 
the garment and the principal components of 
the task are learned. Errors of procedure can 
be observed visually by both trainer and 
learner, so that they can be corrected. Here 
spatial, verbal, and perceptual factors may 
well be important, while kinesthetic and mo- 
tor speed abilities have relatively little effect 
on performance. 




2. After this period of rapid familiarization 
with the nature of the task, the curve becomes 
nearly rectilinear. Here increments of per- 
formance are much smaller, and almost equal 
to one another. Workers report that they are 
getting the “feel” of the task; this kinesthetic 
experience cannot be communicated verbally, 
checked visually, or induced except as the 
learner introduces variations in movements 
which are followed by discriminable improve- 
ment. This process must necessarily be slow. 
We believe that an ability (or a set) to at- 
tend to kinesthetic cues and to relate them to 
errors in performance is important in deter- 
mining individual differences in achievement 
at this time, and that this is the specific vari- 
ance found by Fleishman and Hempel (8), 
rather than an integration ability, as they 
suggest. Visual factors become less impor- 
tant as kinesthetic factors become dominant. 

Motor speed is also important, and begins 
to limit performance at this time. Several 
studies (e.g., 1, 13) have shown that the times 
of long “travel” movements show less improvement
with succeeding trials than do manipulative
movements. These differences also
appear in Fleishman and Hempel’s data (8, 
p. 309). Possibly because of a higher initial 
level of practice for these grosser movements, 
they reach their maximum speeds early in the 
learning period. Much of the variance in 
total speed of performance during this period 
is probably due to the speed of these move- 
ments. It is during this section of the learn- 
ing period that performance begins to cor- 
relate fairly highly with performance after 
learning (14). 

3. In this section the curve reaches a 
plateau. Further increases are probably due 
to changes in motivation and attention, as 
well as to improvements in method of ma- 
nipulation. Motor speed, we suggest, is a ma- 
jor determiner of individual differences in 
level of performance, with sensitivity to kin- 
esthetic feedback contributing a minor por- 
tion of the variance. 

These suggestions should be checked by 
comparison of the changes in shapes of curves 
with changes in the interrelationships of tests 
and performances throughout very long learn- 
ing periods such as those obtained in the pres- 







ent investigation, and by further analysis of 
the shapes of curves obtained in laboratory 
learning situations using more complex tasks 
and extending learning periods further than 
in previous studies. Although the same curve 
applied to all of the tasks in the present 
study, one cannot assume that this obtained 
curve can be safely generalized to other learn- 
ing situations. 


Summary and Conclusions 


Seventy learning curves from operators on 
twelve power sewing-machine operations were 
analyzed. Using the period of initial plateau 
as a criterion of learning, modified Vincent 
curves were established for each job, and 
separate composite curves for each of two 
groups including half the jobs. 


1. Increases in productivity continued over 
long periods beyond the initial plateau for 
individual workers.

2. Differences in complexity and adjustive
requirements of these tasks were not system- 
atically related to differences in shape or slope 
of the composite learning curves for each job. 


3. The composite curve based on the first 
six tasks analyzed matched very closely the 
composite curve of the remaining six tasks. 

4. One negatively accelerated curve could 
serve as the “typical” curve for all of these 
tasks. This curve showed a sharp initial rise, 
followed by a period of more gradual, nearly 
linear, increase. 


Suggestions concerning the change in the 
requirements of the task during learning were 
proposed to account for the shape of the 
curves, and the length of the period of im- 
provement in the individual curves. 


Received July 11, 1955. 


References 


1. Barnes, R. M. Motion and time study (3rd Ed.). New York: Wiley, 1949. Pp. 498-500.

2. Batson, W. H. Acquisition of skill. Psychol. Monogr., 1916, 21, No. 3 (Whole No. 91).

3. Blankenship, A. B., & Taylor, H. R. Prediction of vocational proficiency in three machine operations. J. appl. Psychol., 1938, 22, 518-526.

4. Book, W. F. The psychology of skill: with special reference to its acquisition in typewriting. Univer. Montana Publ. Psychol., Bull. No. 53, 1-158.

5. Bryan, W. L., & Harter, N. Studies in the psychology and physiology of the telegraphic language. Psychol. Rev., 1897, 4, 27-53.

6. Elwell, J. L., & Grindley, G. C. The effect of knowledge of results on learning and performance. Brit. J. Psychol., 1938, 29, 39-53.

7. Fleishman, E. A., & Hempel, W. E., Jr. Changes in factor structure of a complex psychomotor test as a function of practice. Psychometrika, 1954, 19, 239-252.

8. Fleishman, E. A., & Hempel, W. E., Jr. The relation between abilities and improvement with practice in a visual discrimination reaction test. J. exp. Psychol., 1955, 49, 301-310.

9. Kornhauser, A. W. A statistical study of a group of specialized office workers. J. pers. Res., 1923, 2, 103-123.

10. Krueger, W. C. F. Influence of difficulty of perceptual-motor task upon acceleration of curves of learning. J. educ. Psychol., 1947, 38, 51-53.

11. McGehee, W. Cutting training waste. Personnel Psychol., 1948, 1, 331-340.

12. Peterson, J. Experiments in ball-tossing: the significance of learning curves. J. exp. Psychol., 1917, 2, 178-224.

13. Ruben, G., Trebra, P. V., & Smith, K. U. Dimensional analysis of motion: III. Complexity of movement pattern. J. appl. Psychol., 1952, 36, 274.

14. Smith, P. C., & Gold, R. A. Prediction of success from examination of performance during the training period. J. appl. Psychol., 1956, 40, 83-86.

15. Swift, E. J. Studies in the psychology and physiology of learning. Amer. J. Psychol., 1903, 14, 201-251.

16. Tiffin, J. Industrial psychology. New York: Prentice-Hall, 1947.





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


The Effect of Lack of Information on the Undecided 
Response in Attitude Surveys 


Marvin D. Dunnette, Walter H. Uphoff, and Merriam Aylward 1, 2


Industrial Relations Center, University of Minnesota 


The Industrial Relations Center has de- 
veloped a Union Attitude Questionnaire to 
measure the attitudes of union members to- 
ward unionism in general and toward various 
aspects of their local and national unions (1, 
2, 3). The questionnaire consists of 77 statements
to which respondents indicate various
degrees of agreement or disagreement. Scores
are reported by one or both of two methods:
(a) they are derived by the standard Likert
scoring technique; and, (b) they may consist
of percentages responding favorably, unfavorably,
or undecided to each statement.
The questionnaire may be scored for atti- 
tudes toward the following six areas: union- 
ism in general, local union in general, local 
union policies and practices, local union offi- 
cers, local union administration, and the na- 
tional union. 

During development of the questionnaire, it 
became apparent that a few items were draw- 
ing an unduly large proportion of undecided 
responses. Although the large majority of 
items received fewer than 20 per cent unde- 
cided responses, a few ranged as high as 45 
or 50 per cent. Such large proportions of un- 
decided responses rendered interpretation of 
attitude scores difficult. It was decided, 
therefore, to investigate the basis for the 
large undecided response given to certain of 
the items. This article describes the design 
of the study and the results obtained. 


Method 


During the developmental stages of the Union
Attitude Questionnaire, a pool of 121 items was administered
to 821 persons belonging to nine different


1 Funds for this research were provided by the
Graduate School, University of Minnesota. Aid and
advice from the following persons are gratefully acknowledged:
Dale Yoder, director, and Herbert G.
Heneman, Jr., assistant director of the Industrial Relations
Center, Donald G. Paterson, Robert Jones,
Wayne Kirchner, and Lois Boggs.

2 The first and third authors are now with the
Minnesota Mining & Manufacturing Co.


union groups. Many items comprising the current 
questionnaire received a large proportion of unde- 
cided responses. These items are shown in Table 1. 

It is widely recognized that the response undecided 
may reflect one or more of the following situations: 


1. The respondent may actually be neutral with 
respect to the statement being responded to. 

2. The item may be ambiguous, the respondent 
choosing the undecided category because of inability 
to understand the question or to know “what the 
statement is getting at.” 

3. The respondent may be antagonistic toward the 
whole procedure of completing an attitude scale. 
One way of venting his antagonism and unwillingness
to cooperate would be through a wholesale
checking of the undecided responses.

4. The respondent may feel the need to qualify 
his answers. In other words, he may see both sides 
of the question; thus, an undecided response may be 
an effort to “straddle the fence.” 

5. Finally, a respondent may lack specific infor- 
mation or facts necessary to the formation of an 
attitude. He just doesn’t know enough about the 
statement to answer wisely; and, as a consequence, 
he gives an undecided response. 


Occurrence of each of these five possibilities is 
plausible in the use of attitude questionnaires. De- 
velopment and administration of the scale and analy- 
sis of responses, however, provide definite safeguards 
against occurrence of at least two of the above al- 
ternatives. For example, several techniques ranging 
from judgment by experts to item-analysis methods 
are commonly employed to identify ambiguous items. 
The final form of a scientifically developed attitude 
questionnaire will ordinarily include few, if any, am- 
biguous statements. Since development of the IRC 
Union Attitude Questionnaire included several meth- 
ods directed toward reducing item ambiguity, it is 
not likely that a significant proportion of undecided 
responses were due to this factor. 

It is equally unlikely that antagonism on the part 
of the respondents has played a significant role in 
the incidence of undecided responses. This conclu- 
sion is based on several lines of evidence. First, 
every effort was made during administration of the 
questionnaire to secure the understanding and co- 
operation of respondents. The purpose of the sur- 
vey was explained, and the confidential nature of 
the returns was emphasized. So far, there has been 
little evidence of open antagonism on the part of 
any survey respondent. When it appeared in suffi- 
cient degree to appear to invalidate the responses, 
the questionnaire was discarded. Secondly, com- 





Lack of Information and Undecided Response 


Table 1

Items Receiving Undecided Responses from 20 Per Cent or More of Persons Belonging to Nine Different Unions

% Undecided   Item

20            Our union president lets a few who like to talk take too much time at our meetings.
21            Officers of my union are chosen because they are real leaders.
22            Every labor union should be required to take out a license from the U. S. Government.
23            Our national union provides the necessary facts and helps at negotiation time.
24            My union does not keep careful enough records of all money taken in and spent.
24            My union spends too much time and money on political action.
              Our union paper gives us only one side of an issue.
              If you read it in the union paper you know you are getting the facts.
              Our national union takes its share of our dues but gives us very little help.
              My union officers spend too much time on things that are of no concern to my union.
              Our national union interferes too much in our local affairs.
              There isn’t a better union than the one I belong to.
              My union looks after labor’s interests in the city council and in the state legislature.
              My union does not teach us enough labor history.
              Our national union exercises too much control over the affairs of our local.
              We give our delegates too much money to spend when they go to conventions.
              Our union officers know how to get the members to do things for the union.
              It is practically impossible to elect different officers in our national union.
              The officers of my national union are paid too much.
              There is not much “rhyme or reason” to the way our union votes to contribute to the various appeals for money that come to it.
              We don’t get enough help for our union educational program from the national union.


pleted questionnaires were examined with a view to- 
ward identifying those with an unduly large number 
of undecided responses. Based on these examinations,
it appears that the propensity to choose the
undecided response differs little from individual to 
individual. It should be noted, finally, that any 
widespread effect leading to wholesale choosing of 
the undecided response would show up in the form 
of a general increase in the percentages of undecided 
responses for all items in the questionnaire. Actually, 
differences among items are far greater than differ- 
ences among individuals. Thus, as has been con- 
cluded above, the role of respondent antagonism in 
leading to undecided responses probably has been 
minor. 

It appears, then, that the major determiners of 
the undecided responses are lack of information 
and actual neutrality or “fence-straddling” attitudes. 
Examination of the content of items in Table 1
suggests rather definitely that special knowledge may 
be required in order to form attitudes toward the 
areas considered. It appears that union members 
feel a definite lack of information concerning sev- 
eral aspects of their union. Note especially that 
seven of the eight items comprising the National 
Union subscale are among the statements in Table 1. 

In order to investigate the relative proportions of 
undecided responses stemming from actual neutrality 
and from lack of information, a sixth alternative 
was added to the items of the Union Attitude Ques- 


tionnaire. This alternative read: I DON’T KNOW 
ENOUGH ABOUT THIS TO ANSWER. 

It was reasoned that persons lacking sufficient 
knowledge to have formed an attitude would check 
this response and that neutral persons would con- 
tinue to check undecided. 

Samples of persons belonging to four different un- 
ions (autoworkers, office workers, retail clerks, and 
sheetmetal workers) were randomly separated into 
two groups. One group received the standard five- 
response questionnaire. The other group received 
the questionnaire with the sixth response added. 
Completed questionnaires were received from 214 
persons in the five-response group (Group I) and 
216 persons in the six-response group (Group II). 


Results 


The study was designed to answer the fol- 
lowing questions: 


1. Does the presence of the “I don’t know” 
response alter the distributions of favorable 
and unfavorable responses? In other words, 
does the “I don’t know” alternative draw re- 
sponses only from the undecided group, or 
does it draw additional responses from per- 
sons with definite attitudes? 







2. Is the proportion of undecided responses 
to items of the questionnaire reduced substan- 
tially by providing the sixth alternative? 

The first question was answered by com- 
paring the proportions of Group I and Group 
II who responded to the various alternatives 
on each of the items. In order to determine 
whether or not “I don’t know” responses were 
drawn entirely from the undecided group, the 
undecided and “I don’t know” responses were 
totaled for Group II and compared with the 
proportion of undecided responses obtained 
from Group I. A total of 385 (77 × 5) comparisons
was made. Differences were tested
for significance by using Zubin’s tables (5). 
The distributions of differences and the num- 
ber of significant differences are shown in 
Table 2. 
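
Zubin's tables are not reproduced here. As a rough modern stand-in, and not the authors' procedure, a pooled two-proportion z test can be used to compare the percentage of Group I and of Group II choosing a given response. The function name and the counts in the example are hypothetical; only the group sizes (214 and 216) come from the text.

```python
from math import erfc, sqrt


def two_proportion_z(count_1, n_1, count_2, n_2):
    """Pooled two-proportion z statistic and two-sided normal p value.

    Assumes the pooled proportion is strictly between 0 and 1.
    """
    p_1, p_2 = count_1 / n_1, count_2 / n_2
    pooled = (count_1 + count_2) / (n_1 + n_2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / standard_error
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail area of the normal curve
    return z, p_value


# Hypothetical item: 60 of 214 persons in Group I and 40 of 216 in Group II
# choose the same response category.
z, p_value = two_proportion_z(60, 214, 40, 216)
```

One such comparison would be made for each of the five response categories on each of the 77 items.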

Only 27 differences were significant at or 
beyond the 5% level. The number to be ex- 
pected by chance alone is about 19 (385 x 
.05); thus, only eight of the differences can 
be attributed to nonchance factors. This is 
striking evidence that persons who choose the 
“I don’t know” response are drawn almost
entirely from the group who would otherwise 
choose undecided. It appears that the inclu- 
sion of a sixth-response alternative—‘I don’t 
know enough about this to answer’—has no 
effect on the responses of persons who have 
formed favorable or unfavorable attitudes. 
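
The chance expectation quoted above can be reproduced directly. The binomial tail in this sketch goes beyond the text and treats the 385 comparisons as independent, which they are not strictly (the five response categories of one item are related), so it is at best a rough guide. The variable and function names are hypothetical.

```python
from math import comb

comparisons = 77 * 5  # 77 items, five response categories each
alpha = 0.05
observed_significant = 27

expected_by_chance = comparisons * alpha  # about 19
excess_over_chance = observed_significant - expected_by_chance  # about 8


def binomial_upper_tail(n, p, k):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Rough probability of 27 or more significant results arising by chance alone,
# under the independence assumption noted above.
tail_probability = binomial_upper_tail(comparisons, alpha, observed_significant)
```

The exact expectation is 19.25, which the text rounds to 19 before subtracting from the 27 observed.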

Data presented in Table 3 bear on the 
second research question. It is clear that the 




proportion of undecided responses has been 
reduced substantially by inclusion of the 
sixth response. Evidently, a large segment of 
the undecided group is made up of persons 
who do not have sufficient information to 
form an opinion. 

Discussion 

It has been argued in this paper that am- 
biguity of items and antagonism on the part 
of questionnaire respondents probably play 
minor roles in the incidence of undecided re- 
sponses on a well-developed, well-adminis- 
tered questionnaire. Experimental evidence 
does suggest, however, that an important seg- 
ment of the undecided group consists of per- 
sons who don’t know enough about the state- 
ment to answer. 

An important question remains—one which 
was not investigated in this study. It bears 
on the interpretation to be given to the seg- 
ment of undecided responses coming from 
persons who feel they do know enough about 
the question to answer. Since other alterna- 
tives have been excluded, it is probable that 
such persons are truly neutral in the sense 
that they have pondered the pros and cons of 
a question and have arrived at a point some- 
where between favorableness and unfavorable- 
ness. 

Evidence in support of this contention 
comes from a study by Rosen and Rosen (4). 
For each item of an attitude questionnaire 


Table 2 


Distributions of Differences Between Percentages in Corresponding Response 


Categories of Groups I and II 





Differences Between 
Percentages Choosing 
Various Responses in 

Groups I and II 


Strongly 


49 25 
28 


Median difference** 
Number significant at 5% level 
Number significant at 1% level 


Unfavorable Unfavorable Undecided* 


Number of Items 


Strongly 


Favorable Favorable 


20 24 27 
47 32 41 
19 
2 


8 
0 





* For Group II, responses for undecided and I don't know enough about this to answer were summed and compared with the 


undecided response in Group I. 


** In each case, the proportion in Group II was subtracted from the proportion in the corresponding category of Group I. 







Table 3 


Percentage of Undecided Responses Given by 
Groups I and II 


Number of Items 


Group II

% Undecided (6 responses)
0-4 1 4 

5-9 5 22 

10-14 28 

15-19 13 

20-24 8 

25-29 

30-34 

35-39 

40-44

45-49 

50-54 

55-59 


60-64 


Group I 
(5 responses) 


Median Value 


(measuring attitudes of members toward their 
unions), they asked respondents to indicate: 
(a) the extent to which the practice should or 
should not be followed, (b) their perception
of the extent to which it actually was being 
followed, and (c) their general level of satis- 
faction with the current practice as it was 
perceived. It was reasoned that for satisfied 
persons, perceptions would correspond closely 
with desires. On the other hand, dissatisfac- 
tion would stem from a low correspondence 
between perceptions and desires, and unde- 
cided (or neutral) persons would exhibit a 
degree of correspondence somewhere between 
that of satisfied and dissatisfied persons. 

The authors’ hypothesis was supported on 
each of the 27 items of their questionnaire. 
Differences in the average degree of corre- 
spondence between desires and perceptions of 
satisfied and dissatisfied persons were sta- 
tistically significant on all but two items; 
the average degree of correspondence between
desires and perceptions shown by undecided
respondents was between that of the
other two groups on all items of the ques- 
tionnaire. 

Results from the present study and from 
the Rosens’ study lead to the following con- 
clusion: Undecided responses to items of a 
scientifically developed, professionally admin- 
istered questionnaire stem from two major 
sources: (a) persons who lack sufficient in- 
formation on the point in question to form 
an attitude or answer wisely, and, (b) per- 
sons who do have knowledge of the point in 
question, have considered the pros and cons, 
and have arrived at a neutral or “undecided” 
point. 

It would appear that the “I don’t know” 
response would be a wise addition to attitude 
surveys. This is especially true when items 
of the survey require information (e.g., knowl- 
edge of the activities of the national union) 
which may not be commonly held by the 
survey respondents. 


Received July 11, 1955. 


References 


1. Aylward, Merriam, Uphoff, W. H., Kirchner,
W. K., & Dunnette, M. D. Development and
validation of a union attitude questionnaire.
Mimeo. Release No. 7. Minneapolis: Industrial
Relations Center, Univer. of Minnesota,
June, 1955.

2. Dunnette, M. D., & Uphoff, W. H. Union attitudes
and membership participation. Business
News Notes, School of Business Administration,
1955, No. 24.

3. Dunnette, M. D., Kirchner, W. K., & Uphoff,
W. H. Development and validation of a union
attitude questionnaire. J. pers. Admin.,
in press.

4. Rosen, H., & Rosen, Ruth. The validity of
“undecided” answers in questionnaire responses.
J. appl. Psychol., 1955, 39, 178-181.

5. Zubin, J. Nomographs for determining the
significance of the difference between the
frequency of events in two contrasted series or
groups. J. Amer. statist. Ass., 1939, 34, 539-544.





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


The Prediction of Attrition in Trade School Courses 


C. H. Patterson ¹, ²


Veterans Administration Regional Office,³ St. Paul, Minnesota


The study of factors related to success in 
trade school training has received much less 
attention than the prediction of success in 
college training. Nevertheless a considerable 
number of studies does exist. These have been
reviewed elsewhere (5), and will not be men- 
tioned here. Most of these studies are inade- 
quate in one or more respects, so that there is 
a need for studies utilizing adequate samples, 
an acceptable design including necessary con- 
trols, and appropriate statistical analyses the 
assumptions of which are met by the data. 
The present report is of a study attempting 
to meet these requirements. 

The institution and sample. The school
studied is a large, privately endowed, non- 
profit school in a large city in the Middle 
West. A scattering of students is drawn from 
the entire country, but almost all are from 
Minnesota. Applicants must be 16 years of 
age or over with a general educational re- 
quirement stated as follows: “Educational 
background sufficient to indicate successful 
progress in the training program is the basic 
requirement for admission.” This is inter- 
preted to mean ordinarily the completion of 
the eighth grade, though a few students are 
admitted with completion of seven years of 
education. Very few applicants are denied 
admission because of poor school records, and 
a few are discouraged, but not denied admis- 
sion if they persist. 

The students in this study include those 
entering the school during the first three 
monthly enrollment dates of the 1953-54 
school year. The courses and the distribu- 
tion of the students among them are shown 


1 The writer is indebted to the staff of the institu- 
tion studied, who made the investigation possible, 
but who prefer to remain anonymous. 

2 Now at the University of Illinois. 

3 Although this article has been approved for pub- 
lication by the Veterans Administration, the conclu- 
sions reached are those of the author and do not 
necessarily reflect the position of the Veterans Ad- 
ministration. 


in Table 1 (Total Group). Over 80 per cent 
of all students entering these courses were in- 
cluded; those not included had incomplete 
data or reported too late to take the tests. 
The mean age of the sample of 350 students 
was 21.99, with an SD of 4.32. Mean educa- 
tion completed was 11.54, with an SD of 1.38. 

The sample used in the present study was 
compared to students tested during the re- 
mainder of the school year and the beginning 
of the 1954-55 school year. The three groups
were similar in age, education, and test scores. 

Measuring instruments used. The follow- 
ing tests were used: the Bennett Test of Me- 
chanical Comprehension, Form AA (1), the 
Revised Minnesota Paper Form Board, Form 
MA (3), the Army General Classification 
Test, First Civilian Edition, Form AM (7). 
Other nonability factors were collected by 
means of a questionnaire which included the 
areas of personal data (age, marital status, 
dependents, veteran status, and length of
time elapsed since the decision to enter the 
school), socioeconomic background (father’s 
occupation, father’s education, and urban or 
rural background), educational background 
(education completed, attitude toward school, 
subject liked best, number of shop, mathe- 
matics, and science courses reported taken, 
and interval since last school attendance), 
and previous vocational training and experi- 
ence (work experience, previous training, and 
previous experience in the occupation se- 
lected). 

The criterion. The criterion used was the 
dichotomy of completion or noncompletion of 
the first six months of the course. This is an 
objective, practically significant criterion to 
attempt to predict. Approximately 40 per 
cent of those students entering the courses 
studied fail to complete more than six months 
of training. The proportion dropping after 
six months decreases rapidly; furthermore 
some of those leaving after completion of several
months of the course are not failures, but
leave to take jobs or enter apprenticeship
training in the trade.


Table 1

Pass-Fail Status of Students in Groups I and II and the Total Group by Courses

Columns: Pass and Fail counts for Group I, Group II, and the Total Group.

Course

Air Conditioning—General
Air Conditioning—Refrigeration
Automobile—General
Electrical
Building Construction
Automobile
Drafting and Estimation
Building Construction—Carpentry
Electrical—General
General Mechanics
Highway, RR, and Municipal Construction
Machine Shop
Mechanical Drafting
Printing
Radio and Electronics
Welding
Total

It is recognized that such a criterion is a 
complex one. Students leave school for many 
reasons in addition to inability to handle the 
work, including interest, motivation, person- 
ality factors, etc. Some leave for reasons be- 
yond their control, such as personal illness, 
illness or death in the family, financial diffi- 
culty, or being drafted into service. An at- 
tempt was made to identify such students, 
from the reasons given by the students for 
leaving. These are probably unreliable. 
Those students giving one of the reasons listed 
above, with the exception of financial diffi- 
culty, were identified, and if they were also 
doing satisfactory work they were not in- 
cluded in the drop-out category. Twenty- 
five students in all were eliminated on this 
basis from the total group of 375 students 
upon whom complete data were available, 
leaving 350 to constitute the sample. 


Design and procedure. The tests and the questionnaire
were administered prior to the beginning
of classes. Six months later each student was classified
as (a) failure: not in school (N = 126), (b)
successful: still in school (N = 224), or (c) neither:
left school apparently for reasons beyond his control
(N = 25). Those in the last category were discarded
from the study. The remaining 350 constituting
the sample were split into random halves, by
course. Each half was used both as an experimental
and a control group in the procedure of
double cross validation; the halves are designated
as Group I and Group II in the tables.

The questionnaire items were analyzed by means 
of χ². The test-criterion relationships were studied
by means of biserial correlation and the linear dis- 
criminant function. 

The two random halves of the total group were 
compared in all measured characteristics. They did 
not differ in any respect except the correlations of 
the individual tests with the criterion. The biserial
correlations of the Bennett, the AGCT Blocks, and 
the AGCT total scores were significantly lower in 
Group I. This is an important factor in the results 
and in the cross validation. The reasons for these 
differences are unknown. They suggest that great 
caution should be used in accepting a second sample 
as equivalent in cross validation. 


Results: Background factors. The results 
for the 18 background and socioeconomic
factors studied are relatively meager. The 
individual most likely to persist in trade 
school training is between 20 and 30 years of 
age, has had several shop, science, and math 
courses in school, and has had some work 
experience, but not necessarily in the field in 
which he is training. Although number of 
years of education completed was not sig- 
nificant, an analysis comparing high school 
graduates with nongraduates indicated that
high school graduates were more likely to persist
in training.


Table 2

Biserial Correlations of AGCT, Bennett, and Paper Form Board Tests with Pass-Fail Criterion for
Group I, Group II, and the Total Group

                   Group I                      Group II                          Total Group
Test               Mean,     Mean,     r_bis    Mean,        Mean,    r_bis        Mean,     Mean,     r_bis***
                   Pass      Fail               Pass         Fail                  Pass      Fail
                   (N=109)   (N=66)                                                (N=224)   (N=126)

AGCT—Vocabulary    30.56     27.47     .297**   [illegible]  25.88    .366**       30.22     26.71     .328**
     Blocks        31.02     29.76     .104     [illegible]  26.73    [illegible]  31.79     28.32     .294**
     Arithmetic    36.54     34.02     .257**   [illegible]  32.20    .438**       36.55     33.15     .343**
     Total         97.66     91.09     .266**   [illegible]  84.73    .535**       98.35     88.06     .307**
Bennett            44.64     41.74     .213*    [illegible]  38.22    .509**       45.04     40.06     .358**
RMPFB              47.45     43.30     .261**   [illegible]  40.62    [illegible]  47.02     42.02     .342**

* Significant at the .05 level.
** Significant at the .01 level.
*** The standard error of an r_bis of .00 (assuming the null hypothesis) is .096 for Group I, .097 for Group II, and .069 for the Total Group.

Results: Tests. The biserial correlations of 
the aptitude tests and criterion are shown in 
Table 2. An indication of the overlapping 
on the tests between the criterion groups is 
the fact that, in the total group, cutting scores 
on each test set to eliminate from 46 to 50 
per cent of the failures would eliminate from 
21 to 29 per cent of the successful students. 
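
The standard errors quoted in the note to Table 2 can be checked with a short sketch. It assumes the usual large-sample formula for the standard error of a biserial r under the null hypothesis, sqrt(pq) / (y sqrt(N)), where p and q are the pass and fail proportions and y is the ordinate of the normal curve at the point of dichotomy. The article does not say which formula was used, the function name is illustrative, and the Group II counts (115 pass, 60 fail) are taken from Table 4.

```python
from math import sqrt
from statistics import NormalDist


def se_null_biserial(n_pass, n_fail):
    """Standard error of a biserial r of .00: sqrt(p*q) / (y * sqrt(N)).

    Assumed formula; the article does not state how its values were obtained.
    """
    n = n_pass + n_fail
    p = n_pass / n
    q = 1.0 - p
    std_normal = NormalDist()
    z = std_normal.inv_cdf(p)   # point of dichotomy on the normal curve
    y = std_normal.pdf(z)       # ordinate of the normal curve at that point
    return sqrt(p * q) / (y * sqrt(n))


# Pass and fail counts as reported in the article.
groups = {
    "Group I": (109, 66),
    "Group II": (115, 60),
    "Total Group": (224, 126),
}
standard_errors = {name: se_null_biserial(*counts) for name, counts in groups.items()}

for name, se in standard_errors.items():
    print(f"{name}: SE = {se:.3f}")
```

Under this assumption the computed values agree with the reported .096, .097, and .069 to within about .001.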

Test scores were combined by means of the 
linear discriminant function. Analysis of the 
data indicated that they satisfied the assump- 
tions of multivariate normality and equality 
of the variance-covariance matrices sufficiently 
well to be suitable for the application of this
technique. The application of Hotelling’s generalized
T test indicated that the five test or
subtest scores taken together discriminated
significantly between the two criterion groups
in the total group.


Table 3

Significance of Increase in D² Resulting from
the Addition of Tests

                      Group I    Group II    Total Group
Tests*                p          p           p

1                     <.01       <.01        <.01
2-1                   >.05       <.01        <.01
3-(1+2)               <.01       .05         <.01
4-(1+2+3)             >.05       <.01        >.05
5-(1+2+3+4)           >.05       >.05        >.05
[label illegible]     >.05       <.01        <.01
[label illegible]     <.01       <.01        <.01
[label illegible]     >.05       <.01        <.01
[label illegible]     >.05       >.05        >.05

* 1 = RMPFB, 2 = Bennett, 3 = AGCT—Vocabulary, 4 = AGCT—Blocks,
5 = AGCT—Arithmetic, 6 = AGCT—Total.

Three combinations of the five test scores 
were analyzed for significance of discrimina- 
tion in the two groups and the total sample. 
Table 3 indicates the levels of significance of 
the various tests and combinations as measured
by the D² statistic, which indicates the
distance by which the two criterion groups 
are separated. It is apparent that using the 
part scores of the AGCT does not result in 
significantly greater discrimination than using 
the total score alone. The best combination 
of scores is the Paper Form Board, Bennett, 
and AGCT total. Although in Group I the 
addition of the latter two does not increase 
the discrimination significantly over that ob- 
tained with the Paper Form Board alone, 
they were retained in the equations for pur- 
poses of cross validation. 

The linear discriminant weights are as fol- 
lows: 

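Group I: $L = .034955X_1 + .013808X_2 + .015998X_6$
Group II: $L = .029408X_1 + .074183X_2 + .043446X_6$
Total Group: $L = .033525X_1 + .039706X_2 + .026995X_6$

Here $X_1$ is the Paper Form Board score, $X_2$ the Bennett score, and $X_6$ the AGCT total score, following the numbering in the note to Table 3.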


Equivalent multiple point-biserial R’s ob- 
tained by a method described by Fisher (2) 
are .36, .51, and .45, respectively, for the 
three equations. 
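
As an illustration of how such an equation is applied, the sketch below evaluates the Total Group function at the Total Group pass and fail means reported in Table 2. It assumes that the three terms refer to the Revised Minnesota Paper Form Board, the Bennett, and the AGCT total score, in that order (the best combination named in the text). The midpoint shown is only an uncorrected illustration; it is not the criterion discriminatory score used in the study, which also involved a correction for the relative proportions in the criterion categories.

```python
# Total Group weights from the article. Assigning the three terms to
# RMPFB, Bennett, and AGCT total is an assumption based on the text.
TOTAL_GROUP_WEIGHTS = {"rmpfb": 0.033525, "bennett": 0.039706, "agct_total": 0.026995}

# Total Group means from Table 2.
PASS_MEANS = {"rmpfb": 47.02, "bennett": 45.04, "agct_total": 98.35}
FAIL_MEANS = {"rmpfb": 42.02, "bennett": 40.06, "agct_total": 88.06}


def discriminant_score(scores, weights=TOTAL_GROUP_WEIGHTS):
    """Linear discriminant score L: the weighted sum of the test scores."""
    return sum(weights[test] * scores[test] for test in weights)


l_pass = discriminant_score(PASS_MEANS)
l_fail = discriminant_score(FAIL_MEANS)
midpoint = (l_pass + l_fail) / 2  # uncorrected illustration only

print(f"L at pass means: {l_pass:.3f}")
print(f"L at fail means: {l_fail:.3f}")
print(f"Midpoint:        {midpoint:.3f}")
```

A student whose score falls above the cutting point would be predicted to persist; the overlap between the groups described in the text is the reason such a cut misclassifies many students.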

The weights given in the preceding equations
were used, with a correction for relative
proportions in the criterion categories, to obtain
criterion scores for each category and a
criterion discriminatory score for each group.
These weights and criterion discriminatory
scores for Group I and Group II were used in
cross validating on the group not used in obtaining
the weights. The results of this double
cross validation are shown in Table 4. It
is apparent that the weights obtained on
Group II do not hold up when applied to
Group I, while those obtained on Group I do
succeed in discriminating significantly when
applied to Group II. The t test of the significance
of the difference between those correctly
classified and the number to be expected
to be correctly classified by chance,
a more stringent test of the results than χ²,
is 3.90 for the latter cross validation, which
is significant beyond the .001 level.

These results, which are a consequence of
the differences in validities of the tests in the
two groups, are equivocal as to the value of
the tests in selection. They indicate the need
for further cross validation.
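
The χ² values attached to Table 4 can be recomputed from the cell counts. The sketch below applies the ordinary Pearson χ² for a 2 × 2 table without a continuity correction; that choice is an assumption, made because it reproduces the reported figures, and the function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: actual pass, actual fail. Columns: predicted pass, predicted fail.
group_ii_with_group_i_weights = chi_square_2x2(104, 11, 35, 25)
group_i_with_group_ii_weights = chi_square_2x2(92, 17, 49, 17)

print(f"Group II using Group I weights: {group_ii_with_group_i_weights:.2f}")
print(f"Group I using Group II weights: {group_i_with_group_ii_weights:.2f}")
```

The two results round to 24.87 and 2.71, the values given with Table 4, and agree with the statement that only the Group I weights hold up on the other half.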

Further cross validation. Additional cross 
validation was possible, since criterion data 
became available on 302 students tested dur- 
ing the latter part of the 1953-54 school year. 
Four of these students were eliminated as 
having left for reasons beyond their control. 
The mean age of this sample was 22.86 (SD 
3.47) and mean years of education completed 
was 11.52 (SD 1.38). 

Discriminant weights and discriminant cri- 
terion scores determined upon the total group 
of 350 cases previously studied were applied 
to these 298 new cases, 170 of whom were 
successful and 128 failures, with the results 
shown in Table 5, where the new sample is 
designated as Group B and the previous group 
of 350 cases is designated as Group A. 
Forty-three per cent of Group B failed,
compared to 36 per cent of Group A, which
accounts in part at least for the relatively
large proportion of failures who were predicted
as successful. The t ratio of the obtained
versus chance accuracy of prediction
is 3.11, significant beyond the .001 level.
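
The percentages in this paragraph follow directly from the counts in Table 5. The sketch below recomputes them together with the overall proportion correctly classified; the variable names are illustrative, and the article's own t ratio is not reproduced because its formula is not given.

```python
# Cell counts from Table 5 (Group B classified with Group A weights).
pass_pred_pass, pass_pred_fail = 156, 14
fail_pred_pass, fail_pred_fail = 92, 36

n_pass = pass_pred_pass + pass_pred_fail
n_fail = fail_pred_pass + fail_pred_fail
n_total = n_pass + n_fail

failure_rate_b = n_fail / n_total                        # share of Group B that failed
failure_rate_a = 126 / 350                               # share of Group A that failed
hit_rate = (pass_pred_pass + fail_pred_fail) / n_total   # correctly classified
missed_failures = fail_pred_pass / n_fail                # failures predicted to pass

print(f"Group B failure rate: {failure_rate_b:.2f}")
print(f"Group A failure rate: {failure_rate_a:.2f}")
print(f"Correctly classified: {hit_rate:.3f}")
print(f"Failures predicted as successful: {missed_failures:.3f}")
```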

Table 4

Results of Cross Validation Using Group I Weights on
Group II and Group II Weights on Group I

                            Predicted Classification
                  Group II Using            Group I Using
Actual            Group I Weights*          Group II Weights**
Classification    Pass    Fail    Total     Pass    Fail    Total

Pass              104     11      115       92      17      109
Fail              35      25      60        49      17      66
Total             139     36      175       141     34      175

* χ² = 24.87, p < .001.
** χ² = 2.71, .05 < p < .10.


Discussion and summary. The results obtained
indicate that it is possible to predict,
with significantly greater than chance success,
persistence in trade school training for
at least six months, by use of the Revised
Minnesota Paper Form Board Test, the Bennett
Test of Mechanical Comprehension, and the
Army General Classification Test.

The degree of accuracy of prediction 
achieved leaves much to be desired. A cut- 
ting score on the composite criterion score set 
to eliminate 50 per cent of the drop-outs 
would also eliminate 30 per cent of the suc- 
cessful. Cutting scores could be set to elimi- 
nate 29 per cent of the drop-outs while re- 
jecting 9 per cent of the successful, or to 
eliminate 4 per cent of the drop-outs without 
rejecting any of the successful. 

The level of prediction achieved is reduced 
by two factors which must be recognized. 
The first is the fact that the criterion is 
admittedly complex. Students drop out for 
many reasons, including ability, interest, and 
motivation, as well as for reasons beyond 
their control. An analysis of the grades of 
the 126 drop-outs in Group A indicated that
3 were doing above average work at the time
they left school, and 23 were doing average
work. For Group B, 13 were above average,
and 22 average.


Table 5

Results of Cross Validation Using Group A
(Total Group) Weights on Group B

Actual            Predicted Classification*
Classification    Pass    Fail    Total

Pass              156     14      170
Fail              92      36      128
Total             248     50      298

* χ² = 20.69, p < .001.

A second factor affecting the results is that 
of differences among the courses studied. In 
the present study, 15 courses were grouped 
together, since there were not enough stu- 
dents in each course to study them separately. 
This study produced evidence from the analy- 
sis of variance of the course means that there 
are differences on the tests among students 
entering the various courses (4). The course 
means on these tests correlate significantly 
(ρ's = .59 to .80) with rankings of the
courses by difficulty level by two competent 
judges, who correlated .93 in their rankings. 
There is further evidence that, although stu- 
dents thus appear to select themselves to some 
extent in terms of the difficulty levels of the 
courses, the drop-out ratio still may vary sig- 
nificantly among the courses, and that this 
ratio may be related positively to the diffi- 
culty levels of the courses. 

These considerations indicate that it would 
be desirable to study further the relationships
between these tests and a modified criterion
better reflecting achievement in the school, 
treating courses, or groups of homogeneous 
courses, separately. 


Received July 12, 1955. 


References 


1. Bennett, G. K. Test of Mechanical Comprehension,
Form AA. Manual. New York: Psychological
Corporation, 1951.

2. Fisher, R. A. Statistical methods for research
workers. (11th Ed.) London: Oliver &
Boyd, 1950.

3. Likert, R., & Quasha, W. H. The Revised Minnesota
Paper Form Board Test Manual. New
York: Psychological Corporation, 1948.

4. Otterness, W. B., Patterson, C. H., Johnson, R. H.,
& Peterson, L. R. Trade school norms for
some commonly used tests. J. appl. Psychol.,
1956, 40, 57-60.

5. Patterson, C. H. Tests and background factors
related to drop-outs in an industrial institute.
Unpublished doctor’s dissertation, Univer. of
Minnesota, 1955.

6. Rao, C. R. Advanced statistical methods in biometric
research. New York: Wiley, 1952.

7. Science Research Associates. Examiner Manual
for the Army General Classification Test,
First Civilian Edition. Revised. Chicago:
Science Research Associates, 1948.





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


The Selection of Graduate Students in Public Health 
Education 


Richard P. Barthol 


Pennsylvania State College 


and Barbara A. Kirk 


University of California Counseling Center, Berkeley 


This study was undertaken to determine 
whether tests could improve significantly the 
procedures used to select students for admis- 
sion to a graduate program leading to an 
M.P.H. degree in Public Health Education, 
and to specify how these tests should be used. 

In 1952 students were admitted to this pro- 
gram on the basis of academic records, refer- 
ences, and other historical data. A selection 
and validation program was started at that 
time by the University Counseling Center and 
covers three separate classes in public health 
education, limited to those students who were 
of English speaking heritage. The first two 
classes, Class A (N equals 20) and Class B 
(N equals 11), were selected by members of 
the School faculty without access to the test 
data. These students were tested at the time 
of admission, and the test data were filed. 
Class C (N equals 11) was selected partially 
by the test results. Formal analysis of the 
test data was not started until Class C had 
completed its academic year. 


The Test Battery and Criteria 

The academic preparation and subsequent 
work of the public health educator were in- 
vestigated. A test which had not yet been 
published, the Concept Mastery, was selected 
as a predictor of academic success.¹ A stable
personality structure plus a genuine interest 
in welfare and in working with people were 
indicated. The Strong Vocational Interest 
Blank and the Minnesota Multiphasic Per- 
sonality Inventory were selected to meet 
these needs. 


1 This test was developed by Dr. Lewis M. Ter- 
man and associates for follow-up of his gifted group. 
It has been used with advanced graduate students, 
and will be published by the Psychological Corpora- 
tion. 


For criteria, rankings of academic progress 
by members of the faculty were used. Two 
subrankings proved useful—organizational 
ability and facility with interpersonal rela- 
tions. (Grades were not used because the 
range of grades at the graduate level was so 
restricted.) The students in each class were 
ranked independently by three raters at the 
close of each academic year. In addition, 
students in Classes A and B were evaluated 
after they had been working out in the field. 
These latter rankings were based partially on 
the reports of the field supervisors, and partially
on personal observation.²

Scores on an achievement test, the Ameri- 
can Public Health Association Examination, 
were available. Classes A and B were given 
the standard examination at the beginning 
and the end of the academic program. Class 
C was given an abridged version of this test. 
The test consists of 300 multiple-choice ques- 
tions covering the fields in public health edu- 
cation. The abridged version, 110 questions, 
does not have the national norms available 
for the standard examination. It must be 
acknowledged that the APHA Examination 
was not considered a predictor of future aca- 
demic success, but only a measure of past 
academic achievement. The pre- and post- 
tests had been used by the School to assist 
the staff in planning and evaluating the gradu- 
ate program. Only the pretest was used in 
this study. 


Methods and Procedure 


Rank-order correlation coefficients were obtained
for the Concept Mastery and APHA examinations
with the criteria. Cut-off scores were obtained for
each. The mean score of the APHA norm group
was used for that test. A raw score of 55 was selected
for the Concept Mastery based on rankings
of Class A.


2 Dr. Dorothy B. Nyswander and Dr. William
Griffiths, who initiated the study, and who had the
foresight and patience to allow it to be done in the
fashion indicated, did most of the rankings. Miss
Sarah Mazelis also contributed her time, experience,
and understanding to this project.

A far more difficult task was the determination of 
selection scores and profiles for the Strong and 
MMPI. First, a clinical analysis of the results for 
each student was made by staff members of the 
Counseling Center skilled in this technique, and each 
student was placed in an accepted or rejected group. 
The Strong and MMPI scores for each student were 
color coded and posted to a master profile sheet 
which was examined for patternings to see if a spe- 
cific score or pattern would predict success or failure. 

The hypotheses developed for one class could be 
immediately checked with the other two classes. 
Two statistical techniques were used. One was to 
use a test as a screening device and then obtain a 
rank-order correlation for the remaining students 
between another test and the criteria. The other 
method was to divide the class into two groups at 
the cutting point of the test and then use the Mann- 
Whitney U test to see if the passed group were 
ranked significantly higher than the rejected group. 
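
The second technique can be sketched as follows. The ranks are hypothetical (they are not data from the study) and the function name is illustrative; U is computed by direct counting of pairs, which is one standard way of obtaining the Mann-Whitney statistic for small groups.

```python
def mann_whitney_u(ranks_a, ranks_b):
    """U for group A: number of (a, b) pairs in which a holds the better rank.

    Rank 1 is the best student; a tied pair counts one half.
    """
    u = 0.0
    for a in ranks_a:
        for b in ranks_b:
            if a < b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical faculty ranks for a class of 11 (1 = best), not data from the study.
passed_ranks = [1, 2, 3, 5, 6, 8]      # students above the cutting score
rejected_ranks = [4, 7, 9, 10, 11]     # students below the cutting score

u_passed = mann_whitney_u(passed_ranks, rejected_ranks)
u_rejected = mann_whitney_u(rejected_ranks, passed_ranks)

print(f"U (passed over rejected): {u_passed}")
print(f"U (rejected over passed): {u_rejected}")
```

The two values always sum to the product of the group sizes (here 6 × 5 = 30); the smaller one is referred to a table of critical values for the two group sizes.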


Results 


The criteria measures (ranking of the stu- 
dents by the faculty) were accepted at face 
value as being appropriate measures of suc- 
cess. Three judges were used for ranking the 
students in academic success. Coefficients of 
concordance were computed and in every case 
the null hypothesis was rejected at the .01 
level of probability. Only two judges were 
qualified to make the postacademic rankings, 
so rank-order correlation was used. The coefficients
for Classes A and B were .95 and
.87, respectively, both significant at the .01
level. The rankings for each class were com- 
bined to form the criteria. The rankings of 
academic success were compared with the 
rankings of postacademic success, and cor- 
relations of .73 and .75 for Classes A and B 
were obtained, both significant at the .01 
level. It was felt that the measures were 
shown to be stable. 
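
For readers unfamiliar with the coefficient of concordance used for the three judges, the sketch below computes Kendall's W for complete rankings without ties. The ranks are hypothetical, not the study's data, and the function name is illustrative.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance W for complete rankings without ties.

    rankings: one list per judge, each a permutation of 1..n over the same students.
    """
    m = len(rankings)        # number of judges
    n = len(rankings[0])     # number of students ranked
    rank_sums = [sum(judge[i] for judge in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical ranks from three judges for five students (1 = best).
judge_ranks = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
]
w = kendalls_w(judge_ranks)
print(f"W = {w:.3f}")
```

W runs from 0 (no agreement among the judges) to 1 (identical rankings), which is why a significant W was taken as grounds for combining the judges' rankings into a single criterion.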

Table 1 contains a summary of the impor- 
tant relationships found among the tests and 
the criteria, and in general indicates that 
Class B tended to conform to normal expec- 
tations: the students with the best back- 
ground, the highest previous achievement, 
and the highest level of mental ability were 
the better students and the better workers. 
This held true to a lesser extent for Class A. 
Class C, of small number, did not follow this 
pattern. Two students with the most aca- 
demic potential produced the least, because 
of emotional stress at this period, as reported 
by the staff. 

Table 2 shows the effectiveness of the four 
tests had they been used as screening devices. 
The Mann-Whitney U test was used to de- 
termine whether the group that would have 
been admitted had significantly higher rank- 
ings than the group that would have been re- 
jected. None of these hurdles significantly 
affected Class C. This was anticipated, since 
the same tests had already been used for se- 
lection, although with different standards. 


Table 1 


Rank-Order Correlations of Tests and Criteria 


Variables 
APHA and academic rank 
APHA+MMPI and academic rank†
APHA and job performance
APHA+MMPI and job performance†
CM and academic rank 
CM and job performance 20 
CM and APHA 17 
Experience and job performance 20 


* P < .05.
** P < .01.


Class A 


Class B Class C 
p N p N p 
11 as 
9 .85** AS 
9 .70* 
6 80 
18 11 53° 
.00 9 32 
P 11 .89** 
Al* 11 .67* 


† The MMPI was used for screening. The correlation is between the APHA and the criterion.





Table 2

Significance of Difference of Rankings of Classes Divided into Two Groups by Single Test Results

                                                    Significance* P <
Test              Basis for Division                Class A    Class B    Class C
                                                    N=20       N=11       N=11

MMPI              Any score (except Mf) = 70        .01        N.S.       N.S.
Strong            Clinical evaluation               N.S.       .05        N.S.
Strong            OL = 49 or MF = 55 (men only)     .05        N.S.       N.S.
CM                Raw score = 55                    N.S.       N.S.       †
APHA exam         Mean of APHA norm group           .01        .01        -
MMPI+Strong+CM    See Table 4                       .01        .01        N.S.

* Mann-Whitney U test.
† No scores below 55.
N.S. = not significant.


Students with scores above the cutting score
of 55 on the Concept Mastery were not
ranked significantly higher than were the students
with scores below 55. Scores ranged
from 30 to 157. Students with scores above
100 were usually found in the top third of
the class, but three of the best students had
scores well below 100.

The MMPI was used to eliminate students
who had any standard scores above 70, except
on the Mf scale. The retained group
was significantly better than the rejected
group for Class A, but the differences were
not significant for the other two classes.

When the classes were divided by a clinical
evaluation of the Strong, only in Class B
was the retained group significantly better
(.05) than the rejected group.

Based on the analysis of the Strong combined
profiles, male students were considered
to be rejected with an OL score below 49 and
an MF score above 55. (This was appropriate
for men only, since there is no OL
score for women.) Table 2 indicates that
only in Class A was the high OL-low MF
group significantly better (.05) than the low
OL-high MF group. Inspection of the data
for the other two classes suggested that the
hypothesis might have been supported in
these classes had the N’s not been so very
small (7 and 6, respectively).

The analysis of the Strong indicated that
many different profiles could be associated
with success in the public health education
graduate curriculum. Apparently most male
applicants for this particular School, whether 
they ultimately succeeded or failed in the pro- 
gram, tended to have high scores in Group V 
(welfare). On the other hand, apparently, 
they do not have interests similar to Group II
(the physical sciences). The only Strong 
group that seemed to be positively related 
to success was Group X (verbal-linguistic). 
Five of the best eight students scored A in 
Group X, while only one of the worst eight 
and one of the middle ten had A’s in Group X. 

No clear-cut pattern emerged from the 
Strong test for women. Although high scores 
on the Social Worker, Psychologist, and Law- 
yer scales seem to be appropriate, several of 
the lowest-rated women received high scores 
in these three areas. High scores in the busi- 
ness and domestic occupations, when not sup- 
ported by other interests, seemed to be inap- 
propriate, but the evidence was not strong. 
It is possible that scores above 55 on the FM 
scale for women would indicate an inappro- 
priate pattern, but the number of cases is too 
small to make any firm statement. 

Table 3

Variables Indicating Success in Curriculum

Instrument               Sign

Strong                   A’s in Group X
                         OL > 55, MF < 49
                         Clinical evaluation
MMPI                     Nothing found
Concept Mastery          Scores above 100
APHA examination         Scores above mean of APHA norm group
Background evaluation    Academic preparation in public health
                         Work experience in public health


Those test results that seemed most useful
were abstracted. Tables 3 and 4 show the
characteristics for each test that seemed to
predict either success or failure in the public
health education program. Also included are
the student background characteristics that
seemed to be related to success or failure. It
is apparently much easier to use the tests for
screening out than for predicting success;
that is to say, it is easier to tell which students
are likely to fail than which students
are likely to be at the top of the class. When
all of the negative principles were applied to 
Classes A and B, only six of the original 
twenty remained in Class A, and three of the 
original eleven in Class B. Four of the best 
five students were retained in Class A, and 
three of the best five were retained in Class 
B. Had the suggested criteria been used 
without reference to any other factors, the 


selected group would have been very much 
smaller, but significantly better than the re- 


jected group. In Class C, five middle-ranked 
students would have been rejected had the 
tests been used again with the additional 
negative weightings. Since none of the best 
students was dropped, the suggested negative 
criteria would probably be more rigorous than 
the former methods but would pass the best 
students. 

The value of the three tests as aids in se- 
lection was subjectively supported by mem- 
bers of the School faculty who stated that 
Class C was the best group of students they 
had had. They further stated approval of 
the selection that would have been made had 
the tests been used for all three classes. Ob- 
viously this subjective evidence is not criti- 
cal; the important fact is that it is not at 
variance with the statistical evidence.


Table 4

Variables Indicating Failure in Curriculum

Instrument               Sign
Strong                   Depressed profile
                         High manipulative w/o other support
                         (Men) MF above 55, OL below 49
                         (Women) High domestic or business w/o other support
MMPI                     Spikes on Pd or Sc (above 65)
                         Any score (except Mf) above 70
                         Clinical appraisal
Concept Mastery          Scores below 55
APHA examination         Scores below mean of APHA norm group
Background evaluation    No academic preparation in either biological sciences
                         or public health
                         No work experience in public health, medical or social
                         service occupations


Discussion


This study indicates once again that a testing
program, even though carefully and logically
designed, must undergo an empirical
analysis. The Strong, usually considered a 
powerful selection device, was not shown to 
be of great value. There did not seem to be 
any scale or combination of scales that would 
pick out the interest pattern of successful 
students in this field; there were, however, 
patterns that seemed to be contraindicators. 
The MMPI was useful for negative screening 
but did not seem to be able to predict suc- 
cess. The Concept Mastery had a cutting 
score for elimination and also a higher score 
that was a good predictor for success. Scores 
in between the two had no predictive value. 
The APHA, which was only in the battery by 
chance, not only had a good cutting score but 
also ranked the students in the approximate 
order of later success. 
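
The two Concept Mastery cutting scores described above can be sketched as a simple three-way banding rule. This is only an illustration: the cutoffs 55 and 100 come from Tables 3 and 4, while the function name, the band labels, and the treatment of scores exactly at a cutoff are assumptions not stated in the source.

```python
def concept_mastery_band(score, low_cut=55, high_cut=100):
    """Band a Concept Mastery score using the two cutting scores.

    low_cut and high_cut default to the values in Tables 3 and 4.
    Scores exactly equal to a cutoff are treated as "no prediction",
    which is an assumption; the source only says "below 55" and "above 100".
    """
    if score < low_cut:
        return "negative sign"    # listed among variables indicating failure
    if score > high_cut:
        return "positive sign"    # listed among variables indicating success
    return "no prediction"        # scores in between had no predictive value


# Hypothetical applicants, for illustration only.
for s in (48, 72, 113):
    print(s, concept_mastery_band(s))
```

A rule of this shape rejects on the low band, favors on the high band, and leaves the middle band to be decided by the other instruments.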

Post facto reasoning indicates that since 
the APHA Examination is an achievement 
test, most students with inadequate back- 
ground in public health education would tend 
to make low scores. This could predict fail- 
ure because either the background is quite 
important or applicants with long-term interests
in the field make the best students. On
the other hand, if a student had the proper 
background but still made a low score, it 
might indicate a lack of ability to apply him- 
self. Superior ability of this sort might be 
indicated when a student without much back- 
ground scores well on the APHA. 

The other three tests in combination worked 
well in eliminating poor students but lacked 
discrimination, since they also screened out 
some of the very good students. If there are 
many more applications than openings, this is 
not important since good students would be 
rejected anyway. If there is enough space to
absorb all good applicants, then harm is done
both to the student and to the program. This
problem has not yet been resolved, since we
have not been able to find the characteristics 
that distinguish the good students with nega- 
tive test results from the poor students with 
negative test results. 

Except for two low-achieving students in 
Class C, all who survived the tests did well. 
The two students were investigated; they 
both had unanticipated medical problems 
which interfered with their success. One of 
the highest students in Class A had negative 
indicators on all of the tests. Because of in- 
sufficient clinical information, we have no hy- 
potheses about why this occurred. 


Received September 12, 1955. 





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


A Comparison of Successful and Unsuccessful Students in 
the Medical School at the University of Minnesota 


Vivian H. Hewer 


University of Minnesota 


The Problem 


Faculty and administrators of the Medical 
School at the University of Minnesota have 
become increasingly concerned with the num- 
ber of students dropped from the Medical 
School during recent years. Poor academic 
achievement and inadequate personal adjust- 
ment are two reasons cited for the loss. 

In an effort to reduce this loss, an attempt 
is being made to improve selection of stu- 
dents admitted to medical school. For a 
number of years, psychological tests including 
the Professional Aptitude Test, now known as 
the Medical College Admission Test, and the 
Minnesota Medical Aptitude Test, as well as 
other methods, were used to select students. 
A marked change was made in the kinds of 
psychological tests required of students seek- 
ing admission to the freshman class in the 
fall of 1954. Whereas formerly the emphasis 
in testing had been on medical aptitude, the 
decision was made at that time to evaluate 
two other psychological attributes, interests 
and personality adjustment, by tests. The 
Strong Vocational Interest Blank (SVIB) 
was selected to measure vocational interest, 
and the Minnesota Multiphasic Personality 
Inventory (MMPI) to measure personality 
adjustment. The Miller Analogies Test (Form 
H) was also added as a test of academic ca- 
pacity. The Minnesota Medical Aptitude 
Test was discontinued, and the Medical College
Admission Test was retained.

There are scattered research data, but very 
few related to the efficiency of the newly se- 
lected tests to predict success or persistence 
of students in medical training. Studies by 
Dvorak (1), Melton (5), and Hewer (4) 
found that score on the Physician key did 
not contribute to prediction of success in pre- 
medical training. Schofield (6), using MMPI 
and equating the ability level, reported medical
students with superior grades had significantly
lower scores on some scales of the test
than did those with inferior grades. Glaser 
(2) also studied the relation of scores on 
MMPI and Miller Analogies (Form G) to 
success in medical school. 

This research is concerned with the com- 
parison, by various methods, of the scores 
made on tests by a group of successful and a 
group of unsuccessful medical students. The 
purpose is to determine whether any of the 
tests, including the newly selected SVIB and 
MMPI, can be used to predict success in 
medical school. 


Method 
Sample 


The successful group was composed of men from 
two classes of medical students, those who entered 
in the fall of 1951 and who had successfully com- 
pleted two years of medical school (N = 115), and 
those who entered in the fall of 1952 and who had 
successfully completed one year of medical school 
(N=110). These data were taken from a paper 
by Smith (7), summarizing test performance of a 
group of successful medical students. The unsuc- 
cessful group (N = 29) was composed of all male 
students dropped from medical school because of 
scholastic failure during a five-year period,
1949-1953. Those students who were re-admitted and
were successful on a second trial in making a satis- 
factory average were excluded from the sample. All 
the available test data on both groups were secured 
from the files of the Student Counseling Bureau at 
the University of Minnesota. 


Procedure 


Three different approaches to the analysis of test 
results were used in this study. In the first, tests 
were applied to determine whether a significant or 
reliable difference exists between the mean scores
made by the two groups on a variety of tests. The
t test was used, or the d test (the Behrens-Fisher
test) when the F test showed that the variances
about the means were not homogeneous. The
variables tested in this part of the study were the 
following: 


1. High School Rank (HSR), the percentile rank 
of the student in his own high school class. 


2. Total Premedical Honor Point Ratio (Total
Pre-Med HPR), which represents grades earned in
all courses taken in premedical training.¹

3. Required Science Premedical Honor Point Ratio
(Req Sc Pre-Med HPR), which represents grades
earned in all science courses required for entrance to
medical school.

4. American Council on Education Psychological
Examination, 1947 form (ACE '47), a college aptitude
test.

5. Cooperative English Test, Form S, lower level
(Coop Eng S), a test of achievement in English.

6. Professional Aptitude Test (PAT), now known
as Medical College Admission Test. The four parts
studied were verbal ability, quantitative ability,
modern society, and medical science.

7. Minnesota Medical Aptitude Test (Minn Med
Apt); only total score was analyzed.

8. Strong Vocational Interest Blank (SVIB); only
score on the Physician key was used.

9. Minnesota Multiphasic Personality Inventory
(MMPI).

The second analysis was the application of the
chi-square test to the frequencies within the two
groups of the occurrence of patternings of elevation
of MMPI scales. Hathaway (3) suggests a rather
elaborate system of coding which takes into account
the degree of elevation of MMPI scales. A very
simple approach was used in this study. Code numbers
were attached to the two scales with the highest
scores regardless of degree of elevation. For example,
a profile with Pd and Ma the highest points
was coded 49. The codes were then tallied, tallying
each scale rather than each combination of two scales.
It was to these frequencies that the chi-square test
was applied.

1 Honor point ratio is figured with A = 3, B = 2,
C = 1, D = 0, F = 0 honor points.


Table 1

F Test and t Test Comparisons on Mean Scores on Tests for Successful and Unsuccessful Medical Students

                             Successful     Unsuccessful
Test                            Mean         N      Mean
1. HSR                          87.2        25      81.7
2. Total Pre-Med HPR             2.2        29       1.9
3. Req Sc Pre-Med HPR            2.2        29       1.8
4. ACE '47                     126.6        26     117.9
5. Coop Eng S                  206.1        27     198.2
6. Prof Apt
   a. Verbal Ability           538.7       ...       ...
   b. Quant Abil               579.6       ...       ...
   c. Mod Soc                  555.4       ...       ...
   d. Med Sc                   556.0       ...       ...
7. Minn Med Apt                168.4       ...       ...
8. Phys Key                     43.0       ...       ...

* = .10. ** = .05. *** = .01.

(... = not legible in the source. The N column for the successful group and the F and t columns are also not legible. Five further unsuccessful-group N values are legible for tests 6 to 8 but cannot be assigned to rows: 26, 25, 26, 27, 19.)


In the third and final analysis, SVIB and MMPI 
profiles were prepared on the successful and unsuc- 
cessful medical students although these data were not 
available for the total sample. Three judges² were
asked to sort SVIB profiles into two groups, indicating
those whom they would and would not recommend
for acceptance in medical school on the basis
of interest measurement. MMPI profiles were sorted
by two clinical psychologists³ into three groups
(accept, reject, and hold), again with regard to
recommendation for medical school. Chi-square tests
were then applied to determine whether judges could
identify the successful and unsuccessful students
through a blind sort of test profiles.


Results 


Results of t tests on the first eight variables
studied are presented in Table 1. Differences 
between total premedical honor point ratio of 
successful and unsuccessful medical students 
are significant at the .01 level. Difference 
between mean scores of the successful and 
unsuccessful students on the Minnesota Medi- 
cal Aptitude Test is also significant at the .01 
level and on the ACE at the .05 level. 

2 The writer is very grateful to Dr. Ralph Berdie,
Dr. Theda Hagenah, and Dr. Wilbur Layton for
their assistance in this part of the study.

3 The writer is very grateful to Dr. Starke Hathaway
and Dr. William Schofield for their assistance
in this part of the study.

Differences in mean score for the two groups
on each scale of the MMPI were tested, and the
L-scale score is the only one which differentiates
the two groups at the .05 level. It should be
added that in this analysis d tests were ap- 
plied to scores on the L (lie), D (depression) 
and Pd (psychopathic deviate) scales because 
F tests indicated lack of homogeneity of vari- 
ance. 
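
The choice between the t test and the d test described above can be sketched as follows. This is a minimal illustration, not the author's computation: it returns only the test statistic, the critical F value `f_crit` is a hypothetical parameter (the source does not say which level was used for the F test), and the significance lookup for the Behrens-Fisher d, which needs its own tables, is left out. The sample scores are invented.

```python
import math
from statistics import mean, variance


def compare_means(x, y, f_crit):
    """Return (method, statistic) for two independent samples.

    f_crit: critical variance ratio supplied by the analyst (assumed here).
    Both samples are assumed to have nonzero variance.
    """
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x), variance(y)        # sample variances (n - 1 divisor)
    f_ratio = max(v1, v2) / min(v1, v2)      # larger variance in the numerator
    diff = mean(x) - mean(y)
    if f_ratio <= f_crit:
        # Variances treated as homogeneous: pooled-variance Student t.
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        return "t", diff / math.sqrt(pooled * (1 / n1 + 1 / n2))
    # Variances differ: separate-variance statistic (Behrens-Fisher d).
    return "d", diff / math.sqrt(v1 / n1 + v2 / n2)


# Hypothetical scores, for illustration only.
print(compare_means([1, 2, 3, 4, 5], [2, 3, 4, 5, 6], f_crit=6.39))
print(compare_means([1, 2, 3, 4, 5], [0, 10, 20, 30, 40], f_crit=6.39))
```

The two statistics share a numerator and differ only in how the standard error is formed, which is why the F test on the variances decides between them.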

Casual inspection of the MMPI profiles 
created the impression that those of the un- 
successful students were more deviate than 
those of successful students. The above 
analysis in which scores of the two groups on 
specific scales were compared revealed no significant
differences except on the L scale. A
second approach to check this impression 
further was to code the MMPI patterns in 
the manner suggested under procedure. An 
inspection of the percentage distributions of 
frequency of the scales suggested, for exam- 
ple, that a statistically higher percentage of 
the unsuccessful students might have elevated 
Pd scores than did the successful students. 
The reverse appeared to be true of the Ma 
scale. The results of the chi-square test ap- 
plied to check this hypothesis are not con- 
clusive, probability occurring between the .05 
and .10 levels. There is, however, at least a 
suggestion that unsuccessful medical students 
may differ from successful students in per- 
sonality organization as measured on MMPI. 
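
The simple coding scheme used for this second analysis (code the two highest scales, then tally each scale separately) might look like the sketch below. The scale numbers are the conventional MMPI numbering, which agrees with the source's example of Pd and Ma being coded 49. Writing the higher scale first, the handling of ties, and the sample profiles are assumptions for illustration.

```python
from collections import Counter

# Conventional MMPI clinical-scale numbers (Pd = 4 and Ma = 9, as in the
# source's example of a profile coded 49).
SCALE_NUMBER = {"Hs": 1, "D": 2, "Hy": 3, "Pd": 4, "Mf": 5,
                "Pa": 6, "Pt": 7, "Sc": 8, "Ma": 9, "Si": 0}


def two_point_code(profile):
    """Code the two highest scales, ignoring degree of elevation.

    The higher scale is written first; ties keep dictionary order.
    Both choices are assumptions, since the source does not specify them.
    """
    top_two = sorted(profile, key=lambda s: profile[s], reverse=True)[:2]
    return "".join(str(SCALE_NUMBER[s]) for s in top_two)


def tally_scales(profiles):
    """Tally each scale separately, not each two-scale combination."""
    counts = Counter()
    for profile in profiles:
        counts.update(two_point_code(profile))   # counts each digit once
    return counts


# Hypothetical T-score profiles, for illustration only.
p1 = {"Hs": 50, "D": 55, "Hy": 52, "Pd": 70, "Mf": 58,
      "Pa": 51, "Pt": 54, "Sc": 56, "Ma": 66, "Si": 45}
p2 = {"Hs": 50, "D": 72, "Hy": 52, "Pd": 68, "Mf": 58,
      "Pa": 51, "Pt": 54, "Sc": 56, "Ma": 60, "Si": 45}
print(two_point_code(p1), two_point_code(p2), dict(tally_scales([p1, p2])))
```

The per-scale tallies for the successful and unsuccessful groups are the frequencies to which a chi-square test would then be applied.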

Table 2

Chi-Square Tests of the Agreement Between Judges' Ratings of SVIB Profiles and Criterion

                                      Criterion
Judges       Rating           Successful   Unsuccessful
Judge I      Successful           64           ...
             Unsuccessful         41           ...
Judge II     Successful           51            16
             Unsuccessful         54             3
Judge III    Successful           60           ...
             Unsuccessful         45           ...
Agree        Successful           50           ...
             Unsuccessful         37           ...

(... = not legible in the source. The chi-square and probability columns are also not legible.)


In the third analysis, the blind sort of
SVIB profiles, there was a high degree of
agreement among the three judges. Judge I
agreed with Judge II 89 per cent of the time 
and with Judge III 91 per cent of the time; 
Judges II and III agreed 89 per cent of the 
time, and all three judges agreed 85 per cent 
of the time. It was not possible, however, 
for the judges to identify medical students 
who failed by inspection of their SVIB pro- 
files. Of the 19 students who failed, the 
three judges agreed that 15, or 79 per cent, 
of these should have been accepted in medi- 
cal school. They agreed to reject 37 or 35 
per cent of the successful students. Table 2 
is a report of chi-square tests of the agree- 
ment between judges’ ratings of SVIB pro- 
files and the criterion. 

Two of the chi-square tests are significant 
at the .05 level, and one at the .01 level. It 
will be noted, however, that the results are in 
a direction opposite from what might be ex- 
pected. In other words, not only were the 
judges not able to identify medical students 
who failed from a rating of their SVIB pro- 
files, but they agreed in labeling as unsuc- 
cessful those who succeeded and as success- 
ful those who failed (.05 level). 
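
A chi-square test of a judge's ratings against the criterion, and the check on direction described above, can be sketched for a single 2 x 2 table. The four counts are the one complete set legible in Table 2 (51 and 54 successful students, 16 and 3 unsuccessful students). Whether the author applied a continuity correction is not stated, so the uncorrected Pearson formula used here is an assumption, and the result need not match the published value.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table

        [[a, b],
         [c, d]]
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: judge's rating (accept, reject).
# Columns: criterion (successful, unsuccessful).
accept_succ, accept_unsucc = 51, 16
reject_succ, reject_unsucc = 54, 3

chi2 = chi_square_2x2(accept_succ, accept_unsucc, reject_succ, reject_unsucc)

# Direction check: proportion the judge would accept within each criterion group.
accept_rate_successful = accept_succ / (accept_succ + reject_succ)
accept_rate_unsuccessful = accept_unsucc / (accept_unsucc + reject_unsucc)

print(round(chi2, 2), round(accept_rate_successful, 2),
      round(accept_rate_unsuccessful, 2))
```

A significant chi-square says only that ratings and criterion are associated. The two acceptance rates show the direction, and here the judge would accept a larger share of the students who later failed than of those who succeeded.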

The two judges who sorted the MMPI pro- 
files agreed 82 per cent of the time. Here 
again it was impossible to identify the un- 
successful medical students. Of the 17 un- 
successful students, both judges agreed they 
would recommend acceptance of 11, or 65 
per cent of them. These judges would not, 
however, reject many of the successful stu- 
dents, nor, for that matter, of the unsuccess- 
ful. Chi-square tests applied to these data 
are presented in Table 3. In the fourth test, 
those cases which the judges agreed should 
be rated “hold” and “reject” were combined 
in the unsuccessful group. There is no indi- 
cation that unsuccessful medical students can 
be identified from their MMPI profiles, since 
all chi-square tests indicate a chance ranking 
of the profiles when compared to the cri- 
terion. 

One other comparison was made, a check 
to see how many students would be rejected 
on the basis of both MMPI and SVIB. In 
only one case, and he was a successful stu- 
dent, would the individual have been re- 
jected on both SVIB and MMPI by all five 
judges. 







Discussion 


Successful medical students have signifi- 
cantly higher premedical honor point ratios, 
both when grades in all premedical courses 
and when grades only in required science 
courses are considered, than do unsuccessful 
medical students. This result gives further 
evidence for continuing the practice of em- 
phasis on premedical grades for selection. 
The analysis of scholastic aptitude tests sug- 
gests that successful medical students as a 
group have a significantly (.05 level) higher 
mean score on ACE than do unsuccessful stu- 
dents. The successful medics made signifi- 
cantly higher scores on the Minnesota Medi- 
cal Aptitude Test, but score on PAT did not 
differentiate the two groups. 

The significantly higher L score on the
MMPI for unsuccessful medical students sug- 
gests a higher degree of defensiveness among 
them in responding to MMPI items which 
may have obscured basic differences in the 
total profile. Practicing counselors have also 
suggested the L scale may have clinical value 
in describing unsophisticated persons with 
poor self-understanding and insight. Inter- 
estingly, Schofield (6) found a group of low- 
achieving medical students had significantly 
higher L scores than did a group of high-
achieving students at the same ability level.
The similarity of these findings may stimulate
further research in this area. It does appear,
however, that the L score may serve as
an indicator of unrealistic attitudes toward 
self which in some way is related to aca- 
demic achievement. 

A test applied to the distribution of pat- 
ternings of MMPI scores gave inconclusive 
evidence of differences in personality pattern- 
ings between the two groups. The unsuc- 
cessful students seemed to have a dispropor- 
tionate number of high Pd scores, suggesting 
individuals of low social concern, poorly or- 
ganized goals, and general immaturity. This 
hypothesis, however, is in need of further 
check. 

Table 3

Chi-Square Tests of the Agreement Between Judges' Ratings of MMPI Profiles and Criterion

                                                Criterion
Judges      Rating                        Successful   Unsuccessful
Judge I     Successful                        78            12
            Unsuccessful                       6             1
Judge II    Successful                        74            12
            Unsuccessful                      10           ...
Agree       Successful                       ...           ...
            Unsuccessful                       5           ...
Agree       Successful                       ...           ...
            (Unsuccessful and Hold)          ...           ...

(... = not legible in the source. The chi-square and probability columns are also not legible.)


Experienced psychologists were unable to
identify successful and unsuccessful medical
students through the use of either SVIB or
MMPI profiles. This, perhaps, is not too
surprising, as it will be recalled that the
unsuccessful students were those dropped for
academic failure. These tests may serve to 
predict later adjustment to the profession, 
an hypothesis that could be checked only 
through follow-up. Medical school adminis- 
trators and faculty are concerned with select- 
ing students who not only can meet academic 
requirements in training, but also who can be 
successful in practicing medicine. 

Some students fail in medical school for 
reasons other than scholastic reasons—lack of 
interest, emotional instability, or difficulty 
with interpersonal relationships. It would be 
expecting too much to have tests identify 
those who failed both for academic and other 
reasons. Further study of the latter group 
is required. 

Conclusions 


This study is a comparison of scores made 
on a group of tests by successful (N = 225)
and unsuccessful (N = 29) medical students.
It was hoped that the results of the study 
would give some assistance in evaluating cur- 
rent medical school procedures in selection. 

The following results were found: 

1. Successful medical students make sig- 
nificantly higher grades (.01 level) in their 
premedical courses than do unsuccessful medi- 
cal students. This is true not only when the 
total honor point ratio is considered, but also 
when honor point ratio in science courses re- 
quired for medical school is considered. 







2. Successful medical students make sig- 
nificantly higher scores (.01 level) on the 
Minnesota Medical Aptitude Test and on the 
ACE (.05 level) than do unsuccessful stu- 
dents. 

3. Unsuccessful medical students had a sig- 
nificantly higher score (.05 level) on the L 
scale of MMPI. This suggests a defensive- 
ness in responding to MMPI items or, pos- 
sibly, low psychological maturity. 

4. There is a suggestion (chi-square .10 > 
P > .05) that the general personality organi- 
zation, as measured on MMPI, may be differ- 
ent for successful and unsuccessful students. 

5. Experienced psychologists were unable 
to identify successful and unsuccessful medi- 
cal students through the use of either SVIB 
or MMPI profiles. 


Received August 17, 1955. 


References 


1. Dvorak, Beatrice. Adjustment of pre-medical
freshmen to the university. Unpublished master's
thesis, Univer. of Minnesota, 1930.

2. Glaser, R. Predicting achievement in medical
school. J. appl. Psychol., 1951, 35, 272-275.

3. Hathaway, S. R., & Meehl, P. E. An atlas for
the clinical use of the MMPI. Minneapolis:
Univer. of Minnesota Press, 1951.

4. Hewer, Vivian H. Vocational interest-achievement-ability
inter-relationships at the college
level. Unpublished doctor's thesis, Univer. of
Minnesota, 1954.

5. Melton, R. Prediction of success of pre-medical
freshmen at the University of Minnesota.
Unpublished master's thesis, Univer. of Minnesota,
1951.

6. Schofield, W. A study of medical students with
the MMPI: III. Personality and academic
success. J. appl. Psychol., 1953, 37, 47-52.

7. Smith, Joyce S. Summary of test performance
of medical students. Unpublished master's
paper, Univer. of Minnesota, 1954.





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


The Development and Standardization of a Preliminary 
Form of an Activity Experience Inventory: A 
Measure of Manifest Interest ¹


Wm. Price Ewens 


Agricultural and Mechanical College of Texas 


Studies of interests and methods of collect- 
ing interest data have led to various clas- 
sifications of interests (2, 6, 9). Expressed 
interests, manifest interests, inventoried interests,
and tested interests are categories found
in these sources. Although recognized as an 
interest type, manifest interest has received 
little consideration by researchers concerned 
with interest measurement. Tests initially 
considered measures of manifest interest (4, 
7, 8) have since been classified as measures 
of tested interest (6). 

Manifest interest has been defined as being 
“synonymous with participation in an activity 
or an occupation” (6). In apparent agree- 
ment with this definition, Travers (9) states 
that “Manifest interests are determined by 
observing what the individual does in his 
spare time or perhaps at work.” The mag- 
nitude of the task and the difficulty of deter- 
mining manifest interest by observation for 
any number of subjects becomes immediately 
apparent, but it seems likely that manifest 
interest might be determined through the use 
of a self-report inventory. 

Dressel and Matteson (1) used the Kuder 
Preference Record items to measure experi- 
ence by changing directions for the instru- 
ment. They concluded that this was an un- 
satisfactory method of measuring experience 
and expressed the opinion that it would be 
necessary to develop an instrument specifi- 
cally designed to measure experience. 


Definition of Problem 


The study on which this report is based
had two major purposes. The first was to
develop and establish normative data for an
experience inventory, and the second² was to
examine the relationships between experience,
as measured by the instrument, and preference
as measured by the Kuder Preference
Record. This report will be concerned with
the first of the above-listed purposes.

1 This study was conducted in the Department of
Education of Stanford University under the direction
of Dr. H. B. McDaniel. A more complete presentation
of results may be found in the original study,
"Experience patterns as related to vocational preference."
Unpublished doctor's dissertation, 1949,
Stanford University Library, Stanford, California.

The problem of developing an experience 
inventory was further defined to give direc- 
tion in item selection and in inventory de- 
sign. The inventory was (a) to measure ac- 
tivity experience in the interest areas used by 
the Kuder Preference Record, Form BB; (b)
to be objectively scored and give a composite 
experience score for each of the interest areas; 
(c) to include activity items within the prob- 
able experience of boys and girls of high 
school age; (d) to use vocabulary of high 
school level; (e) to include directions suffi- 
ciently specific to permit self-administration 
if necessary; and (f) to be administered and 
scored within fifty minutes. 


Procedure 


Development of the Activity Experience In- 
ventory 


From approximately 2,000 activity items written 
by graduate students in counselor training at Stan- 
ford University a trial-form inventory was developed 
containing twenty-five activity items for each of the 
nine areas of the Kuder Preference Record. A five- 
step scale was to be used by the subjects in response 
to each item of the inventory. The five steps were 
arbitrarily assigned weights varying from 0 through 
4, with 0 representing no experience and 4 the
weighting for maximum experience. 
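
The scoring scheme just described (twenty-five items per area, each answered on a five-step scale weighted 0 through 4) can be sketched as follows. Taking the composite score for an area as the plain sum of its item weights is an assumption; the source states the weights and the item counts but not the exact scoring formula. The function and variable names are hypothetical.

```python
AREAS = ["Mechanical", "Computational", "Scientific", "Persuasive", "Artistic",
         "Literary", "Musical", "Social Service", "Clerical"]
ITEMS_PER_AREA = 25       # trial form: twenty-five activity items per area
VALID_WEIGHTS = range(5)  # five-step scale, weights 0 through 4


def score_inventory(responses):
    """Return one composite experience score per interest area.

    responses: dict mapping each area name to a list of 25 item weights.
    The composite is assumed to be the sum of the weights (0 to 100).
    """
    scores = {}
    for area in AREAS:
        items = responses[area]
        if len(items) != ITEMS_PER_AREA:
            raise ValueError(f"{area}: expected {ITEMS_PER_AREA} items")
        if any(w not in VALID_WEIGHTS for w in items):
            raise ValueError(f"{area}: weights must be 0 through 4")
        scores[area] = sum(items)
    return scores


# Hypothetical answer sheet: maximum experience on every item.
all_max = {area: [4] * ITEMS_PER_AREA for area in AREAS}
print(score_inventory(all_max)["Mechanical"])
```

Under this assumption each area score runs from 0 (no experience on any item) to 100 (maximum experience on every item).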

A group of counselor trainees offered constructive
criticism of the definitions of response categories,
the statements, the appropriateness of activity
items, administrability, vocabulary, and the answer
sheet form. The trial form was administered to three
sections of general psychology and to students in a
measurements and evaluation class in a California
college, and finally, to a number of tenth-grade students
in a California high school to get further
evaluation of the inventory relative to the criteria
being used for its construction.

2 "Experience patterns as related to vocational
preference" (to appear, Educ. psychol. Measmt,
1956).


Statistical Analysis 


For statistical analysis of the Activity Experience 
Inventory and for tentative standardization, data 
were collected from students in three California high 
schools. The age and sex composition of the stand- 
ardization group is given in Table 1. Statistical 
analysis of the inventory included examination for 
validity, reliability and intercorrelation and the es- 
tablishment of normative data. 


Validity 


The following are some considerations that directly 
reflect on the validity of any attempt to measure ac- 
tivity experience. 

1. Activity items selected for an inventory at best 
represent an attempt to sample the experience back- 
ground of the individual being measured. 

2. The tendency of a subject to underestimate or 
overestimate, whether conscious or unconscious, is a 
general criticism of rating-scale techniques and will 
be reflected in the validity of this inventory. 

3. The psychological factor of recency might be 
expected to influence the amount of experience indi- 
cated in a particular activity and thus the validity 
of the measure. 

4. The amount of experience in an activity is not 
a direct function of time spent in the activity, but 
would seem to be dependent upon attentiveness, in- 
telligence, related experience, and possibly other 
factors. 

Table 1

Age and Sex Composition of the Sample Used in
Standardizing the Activity Inventory

Age       Male    Female    Total
15         ...       14       19
16         ...      114      204
17         ...      232      ...
18         ...       73      151
19         ...      ...       17
20         ...      ...        2
Total      ...      ...      ...
Mean       ...      ...      ...

(... = not legible in the source.)


Recognizing these problems, the inventory was examined
for content and construct validity. Content
validity was tentatively established by graduate
counselor training students in the developmental
stage as they judged the appropriateness of items
for measuring experience in the several interest areas. 
Criterion groups were not available for determining 
predictive and concurrent validity, but two analyses 
were made to examine the inventory for construct 
validity. This, essentially, is an attempt to validate 
the theory underlying the inventory. 

In the absence of other instruments giving a 
quantitative measure of experience, an Experience 
Data Blank was developed on which, by student re- 
sponse and by survey of school records, data were 
accumulated in each of the following areas: 


1. employment (described to indicate nature of 
job), 

2. hobby and leisure-time activities, 

3. out-of-class school activities, 

4. home duties and unpaid work experience, 

5. courses taken in school. 


Experience data collected from the data blank 
were examined and each item listed by the student 
under the above categories was evaluated by the 
writer relative to the interest areas of the Activity 
Experience Inventory. If a particular experience 
listed by the student related to more than one of 
the interest areas, it was credited accordingly. If an
activity could not logically be classified in any of 
the interest areas, it was omitted from the summa- 
rization. As a next step in analyzing the data, the 
blanks were sorted into stacks representing varying 
amounts of experience for an interest area and as- 
signed rank numbers representing the variation from 
a large amount to a small amount of experience. 
This process of sorting and ranking was repeated for 
each of the interest areas. The rankings of experi- 
ence from the Experience Data Blanks were cor- 
related with scores from the Activity Experience In- 
ventory as indicated in Table 2. 
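
Correlating a ranking of the data blanks with inventory scores can be sketched with a rank-order coefficient. The source does not say which coefficient was used for Table 2, so Spearman's rho, computed here as a Pearson correlation on average ranks, is an assumption, as are the sample values.

```python
import math


def average_ranks(values):
    """Rank values so that the largest gets rank 1; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank-order correlation between two lists of amounts (higher = more)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = math.sqrt(sum((a - mx) ** 2 for a in rx) *
                    sum((b - my) ** 2 for b in ry))
    return num / den


# Hypothetical data: amount of experience judged from the data blanks
# (higher = more) and inventory scores for the same five students.
blank_amount = [5, 4, 4, 2, 1]
inventory_score = [61, 40, 48, 33, 20]
print(round(spearman_rho(blank_amount, inventory_score), 2))
```

Both inputs are treated as amounts, with higher meaning more experience. If the data blanks are recorded as rank numbers in which 1 means the largest amount, negate them before calling `spearman_rho` so that the two orderings run the same way.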


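Table 2

Correlation of Activity Experience Inventory Scores for
High School Students Against an Independent
Measure of Experience

                      Validity Coefficients
Interest Areas              (N = 76)
Mechanical                    .82
Computational                 .39
Scientific                    .33
Persuasive                    .45
Artistic                      ...
Literary                      ...
Musical                       ...
Social Service                .27
Clerical                      ...
Median                        .39

(... = not legible in the source. The Social Service and median values are taken from the text of the Results section.)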










Table 3

Means and Standard Deviations for Male and Female Experience Scores;
Critical Ratios for Significance of Differences in Means

                        Male
Interest Areas          Mean
Mechanical             45.60
Computational          28.62
Scientific             29.86
Persuasive             24.85
Artistic               21.02
Literary               21.06
Musical                22.66
Social Service         27.45
Clerical               23.46

* Mean score for male greater than mean score for female.

(The Female (N = 438) columns and the critical ratios are not legible in the source. Eight standard deviations are legible but cannot be assigned to a row or a sex: 13.66, 14.20, 14.82, 12.55, 12.15, 18.88, 13.28, 13.60.)


Results 


Examination of Table 2 shows validity co- 
efficients varying from .27 for the social serv- 
ice area to .82 for mechanical experience with 
a median coefficient of .39. A study of the 
data on the experience blanks partially ex- 
plains this range of coefficients. Work ex- 
perience and hobbies of a mechanical nature 
listed by the students on the blanks were 
quite objective and varied. The musical and 
clerical areas also yielded a listing of experi- 
ence information that was relatively easy to 
classify. These areas show high validity co- 
efficients. The definiteness of mechanical, 
musical, and clerical experiences can be con- 
trasted with the indefiniteness of the social 
service area. High school courses and other 
experiences are less easily categorized as so- 
cial service in nature, resulting in a lower 
validity coefficient. 

Additional validity evidence is found in 
the examination of mean experience scores 
for boys and girls given in Table 3. Mean 
experience scores for males were found to be 
significantly greater than scores for females 
in the mechanical, computational, and scien- 
tific areas, with mean experience scores in 
the persuasive activities not being signifi- 
cantly different. Females had significantly 
greater mean experience scores than males in 
the artistic, literary, musical, social service, 
and clerical areas. 


The above data showing the relationship 
between experience scores obtained from the 
Activity Experience Inventory and the
amount of experience obtained by an inde- 
pendent and quite different technique, the 
Experience Data Blank, and the mean ex- 
perience scores for males and females are 
suggestive of construct validity of the in- 
strument. 


Reliability 


Reliability coefficients, when correlating 
odd vs. even items of the inventory (Table 
4), for a sample of 398 junior and senior 
high school males varied from .87 to .94 
with a mean of .90. Five of the interest 
areas—mechanical, computational, persuasive, 
musical, and social service—gave coefficients 
equal to or greater than .90. For a sample 
of 438 junior and senior high school females, 
similar coefficients varied from .82 to .92, 
with a mean reliability coefficient of .89. 
The persuasive, artistic, musical, and social 
service areas gave coefficients greater than 
.90.
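
The odd vs. even procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response matrix is hypothetical, the names `pearson_r` and `odd_even_reliability` are introduced here, and the report does not say whether any further correction was applied to the half-test correlation.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(item_scores):
    """Correlate each respondent's odd-item total with the even-item total.

    item_scores: one list of item scores per respondent (hypothetical layout).
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Hypothetical responses: 5 respondents, 6 items scored 0 or 1.
responses = [
    [1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 1, 1, 1],
]
r_half = odd_even_reliability(responses)
```

The same `pearson_r` helper applied to first and second administration totals would give a test-retest coefficient.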

Of the seniors who marked the activity in- 
ventory, 31 males and 35 females were avail- 
able for again marking the inventory in a six- 
month follow-up. The interval of six months 
between the first and second administration 
of the inventory spanned a summer, and the 
subjects had enrolled in college near the end
of this period. Test-retest reliability coeffi-
cients from Table 4 ranged for males from 
.75 to .91, with a mean of .83, and for fe- 
males from .60 to .79, with a mean of .73. 

To examine the inventory for stability of 
profiles, experience scores for 31 males and 
35 females were converted to standard scores, 
ranked in order of these scores, and rho co- 
efficients of the paired rankings determined. 
These coefficients represent the degree of similarity
of experience profiles resulting from
two administrations of the inventory with a 
time interval of six months. The rho coeffi- 
cients for males ranged from .30 to .98, with 
a median of .82. For the 35 females the rho 
coefficients ranged from .45 to .93, with a 
median of .77. 
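
The profile comparison for a single respondent can be sketched as follows. The standard scores below are hypothetical, the function names are introduced here, and the rank-difference formula assumes no tied ranks; the report does not state how ties were handled.

```python
def ranks(values):
    """Rank scores from highest (rank 1) to lowest; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def profile_rho(first, second):
    """Rank-difference (rho) coefficient between two profiles of area scores."""
    n = len(first)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(first), ranks(second)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical standard scores for one respondent on the nine interest areas,
# first administration and six-month follow-up.
first_admin = [1.2, 0.3, 0.8, -0.5, -1.1, -0.2, 0.1, -0.9, 0.6]
second_admin = [1.0, 0.5, 0.4, -0.3, -1.2, -0.6, 0.2, -0.8, 0.7]
rho = profile_rho(first_admin, second_admin)
```

One such coefficient per respondent gives the distribution whose range and median are reported above.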


Intercorrelations 


Intercorrelations calculated from a 20 per 
cent sample of the original data are given in 
Table 5. The coefficients for the areas of 
the inventory ranged for males from .09 to 
.81, with a median coefficient of .54. For fe- 
males the coefficients ranged from .29 to .75, 
with a median of .53. The correlations be- 
tween areas are approximately the same for 
males and females with the exception of three 
coefficients. The musical area, when corre- 
lated with the mechanical, computational,
and scientific areas, gave somewhat larger
coefficients for females than for males. 

These intercorrelations seem large for effi- 
cient testing but may be justified for an in- 
strument of this nature. Since the experi- 
ences of youth tend to encompass all the in- 
terest categories, one would expect positive 
intercorrelations for areas of the inventory. 

Correlations between areas of the Activity 
Experience Inventory are considerably greater 
than for areas of the Kuder Preference Rec- 
ord (3) for which item selection and instru- 
ment design assured low intercorrelations. 
The intercorrelation coefficients for the areas 
of the inventory are about the same magni- 
tude as the coefficients between areas of the 
Strong Vocational Interest Blank (5). 


Normative Data 

The age-grade characteristics of the stand- 
ardization group and the mean experience 
scores and standard deviations for males and 
females are given in Tables 1 and 3, respec- 
tively. The male and female mean experi- 
ence scores were sufficiently different to make 
separate norms necessary. Percentile norms 
developed for each interest area were ar- 
ranged in table form to facilitate construc- 
tion of individual profile charts from raw 
scores. 


Table 4 


Reliability Coefficients Obtained by Correlating Odd vs. Even Items and from Test-Retest Scores 





Odd vs. Even Items (ri:) 


Male 
(NV =398) 


94 
.90 
88 
92° 


Interest Areas 





Mechanical 
Computational 
Scientific 
Persuasive 
Artistic 
Literary 
Musical 

Social Service 
Clerical 


89 
91 
91 
87 


Mean 


Test-retest (r+) 


Male 
(N=31) 


Female 
(N =438) 


Female 
(N=35) 


87 
82 
.86 
.92 
.92 
.88 


84 
85 
83 
89 
78 
82 
91 
75 
76 


ao 
70 
79 
76 
.69 
Be | 
72 
79 
.60 


92 
88 
89 


83 73 










Table 5


Intercorrelations for Areas of the Activity Experience Inventory for Both Male and Female


N (Male) = 79; N (Female) = 89


Interest Areas

1. Mechanical
2. Computational
3. Scientific
4. Persuasive
5. Artistic
6. Literary
7. Musical
8. Social Service
9. Clerical


Summary


1. This report describes the development of
the Activity Experience Inventory, a measure
of manifest interest, using the interest areas
of the Kuder Preference Record, Form BB,
as framework for the inventory.

2. The experience inventory was designed
to be objectively scored, to give a composite
score of experience for each of the interest
areas, to contain activity items within the
probable experience of boys and girls of high
school age, to use vocabulary easily understood
by high school age groups, to contain
directions to permit self-administration and
scoring, and to be administered and scored
in about fifty minutes.

3. Using the Activity Experience Inventory
and the Experience Data Blank, validity coefficients
for the inventory varied from .27 to
.82, with a median of .39. A study of mean
experience patterns for boys and girls gave
further evidence of validity. Males had significantly
greater mean experience scores than
females in the mechanical, computational, and
scientific areas. Females had significantly
greater mean experience scores in the artistic,
literary, musical, social service, and clerical
areas. There was no significant difference in
mean persuasive scores.

4. Odd-even reliability coefficients for the 
Activity Experience Inventory varied for 
males from .87 to .94, with a mean of .90, 
and for females from .82 to .92, with a mean 
of .89. Test-retest reliability coefficients with 
a six-month time interval varied for males 
from .75 to .91, with a mean of .83, and for 
females from .60 to .79, with a mean of .73. 

Rho coefficients were calculated from test- 
retest experience profiles as an indication of 
profile stability. With a time lapse of six 
months between the two administrations of 
the inventory, the coefficients ranged for 
males from .30 to .98, with a median of .82.
Rho coefficients for females ranged from .45
to .93, with a median of .77. 

5. Intercorrelations for areas of the inven- 
tory ranged for males from .09 to .81, with 
a median coefficient of .54. For females the 
coefficients ranged from .29 to .75, with a 
median of .53. 

6. Tentative norms, separate for males and 
females, were established using a sample of 
398 males and 438 females who were juniors 
and seniors in three California high schools. 


Received August 26, 1955. 


References 


1. Dressel, P. L., & Matteson, R. W. The relationship
between experience and interest as measured
by the Kuder Preference Record. Educ.
psychol. Measmt, 1952, 12, 109-116.

2. Fryer, D. The measurement of interests in relation
to human adjustment. New York: Holt,
1931. Pp. 1-363.

3. Kuder, G. F. Revised Manual for the Kuder
Preference Record. Chicago: Science Research
Associates, 1946. Pp. 3-30.

4. Older, H. J. An objective test of vocational interest.
J. appl. Psychol., 1944, 28, 99-108.

5. Strong, E. K. Vocational interests of men and
women. Stanford, Calif.: Stanford Univer.
Press, 1943. Pp. 1-726.

6. Super, D. E. Appraising vocational fitness. New
York: Harper, 1949. Pp. 1-642.

7. Super, D. E., & Haddad, W. C. The effect of
familiarity with an occupational field on a
recognition test of vocational interest. J.
educ. Psychol., 1943, 34, 103-109.

8. Super, D. E., & Roper, Sylvia A. An objective
technique for testing vocational interests. J.
appl. Psychol., 1941, 25, 487-498.

9. Travers, R. M. W. Educational measurement.
New York: Macmillan, 1955. Pp. 3-407.





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


Fakability of the Gordon Personal Profile 


Jay T. Rusmore 


San José State College 


The Gordon Personal Profile (1) is a per- 
sonality test developed by the forced-choice 
technique. While it is not claimed that the 
forced-choice approach renders the Profile 
“fakeproof,” it is stated that “. . . it prob- 
ably is less subject to ‘faking’ than inventory- 
type instruments” (1, p. 10). The present 
study was undertaken to test this statement. 

Longstaff and Jurgensen (2) administered 
the Jurgensen Classification Inventory, a 
forced-choice instrument, to 68 students un- 
der two sets of directions. The first was to 
represent an industrial selection situation, 
and the second a vocational guidance situa- 
tion. Both administrations were scored on a 
Self-Confidence key, and the differences in 
mean scores between the two administrations 
were found not to be statistically significant. 
However, the correlation coefficient between 
scores in the two administrations was .50, 
which the authors interpret as being “not en- 
couraging.” 

Their interpretation of their data brings 
them to the position that other techniques 
must be devised if the problem of malinger- 
ing on personality tests is to be overcome. 
While the present writer would concur with 
a position that no instrument is likely to be 
proof against malingering, his experience with 
the forced-choice test has led him to the po- 
sition that the prospects for the forced-choice 
personality test are somewhat brighter than 
has been previously stated. 

In the present study, to determine the fak- 
ability of the Gordon Personal Profile, the 
experimental situation developed by Long- 
staff and Jurgensen (2, p. 88) was used. 
While the Classification Inventory and the 
Personal Profile both employ forced-choice 
format, they differ in the manner in which 
they were developed. Items were included in 
the Personal Profile on the basis of differ- 
ential discriminating ability as determined by 
factorial composition, while no such selection 
was made for the Classification Inventory 


(2). Furthermore, pairs of items in the Per- 
sonal Profile were prematched on the basis of 
both equality of preference value and differ- 
ential discriminating ability, while items in 
the Classification Inventory were equated 
only on preference value. Finally, all items 
in the Personal Profile are scored on estab- 
lished keys, while the Classification Inven- 
tory has keys developed empirically for par- 
ticular situations utilizing only some of the 
items. In view of these differences, it is con- 
ceivable that the two tests may differ in 
terms of fakability. 


Procedure 


A group of 81 lower division students were given 
the Gordon Personal Profile twice, each time with 
different instructions. 

The first administration was in a simulated indus- 
trial situation. Directions were: “In taking this test 
make the following assumptions. You have just fin- 
ished your college work and are in the employment 
department of the organization you hope to work 
for, applying for a job. This job you are applying 
for is exactly the kind of job you want so it is very 
important to you that you get it. The personnel 
manager informs you that the company has a bat- 
tery of tests they give all their applicants and says, 
‘This is the first test in the battery. It is called the 
Gordon Personal Profile. You will please read the 
directions and then answer the questions.’ ” 

The second administration was in a simulated 
guidance situation. Directions were: “At the last 
meeting of the class you took the Gordon Personal 
Profile assuming you were applying for a job. To- 
day I would like to have you take the test again, 
making the following assumptions: You are having 
a great deal of trouble trying to decide what voca- 
tion you should go into. You finally decide to go 
to the Student Counseling Bureau to see if they can 
give you any assistance. The counselor informs you, 
‘We have a battery of tests we should like to have 
you take. We have found the results very helpful 
in dealing with problems like your own. The first 
test in the battery is called the Gordon Personal 
Profile. Will you please read the directions and 
then answer the questions?’ ” 


Results 


The scales of the Gordon Personal Profile 
are: Ascendancy (A), Responsibility (R), 




Table 1 


Mean Raw Scores and Standard Deviations on Each Scale of the Gordon Personal Profile, 
Administered Under Two Different Sets of Instructions 


(N = 81 San José State College Psychology Students) 








A 
Situation, According 
to Instructions 


Mean SD 


Mean 





Industrial 44 5.5 8.9 
Vocational 3.7 6.2 74 
Difference a 1.5 
Significance of difference (¢) 1.4 FF pag 


Scales 


Mean 


SO SB: J . 26.1 13.9 
7.3 5 ’ ; 23.0 16.6 

a . 3.1 
1.3 j 2.0* 





* Significant at the 5% level of confidence. 
** Significant at the 1% level of confidence. 


Emotional Stability (E), Sociability (S), and 
Total, or over-all self-evaluation (T). Mean 
scores for each scale for both simulated situa- 
tions are given in Table 1. 

It may be noted that for each scale the dif- 
ference is in favor of the “better” score for 
the simulated industrial situation. The t test
for the significance of these differences indi- 
cates that only in the case of R and T are 
these differences significant; R at the 1 per 
cent level of confidence, T at the 5 per cent 
level. 

The Profile was equally reliable under both 
sets of directions. These reliability coeffi- 
cients for “industrial” and “vocational” situa- 
tions, respectively, by scales are: A, .87 and 
.89; R, .86 and .88; E, .81 and .70; S, .87
and .92; T, .93 and .94. 


Table 2 


Relation Between Raw Scores Made on Each Scale of 
The Gordon Personal Profile, Administered Under 
Two Different Sets of Instructions 


(N = 81 San José State College Psychology Students)








Scales 





E 





(Industrial) (Vocational) .7 ; .67 


(Industrial) (Vocational) —.7 3 89 
Corrected for attenuation 
in both variables 





Correlations between each of the scales as 
administered in the two situations are given 
in Table 2. Total score correlational infor- 
mation is also presented. 

It will be seen that the correlation between 
scores for each trait in each of the two situa- 
tions is substantial. For those who may be 
interested, the coefficients are also corrected 
to show the theoretically true relationship less 
certain attenuation due to unreliability of the 
measures. 
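
The usual correction for attenuation in both variables divides the observed correlation by the geometric mean of the two reliabilities. A minimal Python sketch, assuming this standard formula is the one applied and using the Total-score figures reported in this article (observed r of .59; reliabilities of .93 and .94); the function name is introduced here for illustration:

```python
from math import sqrt


def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Estimate the correlation between true scores.

    Standard formula (assumed here): r_true = r_xy / sqrt(r_xx * r_yy).
    """
    return r_observed / sqrt(reliability_x * reliability_y)


# Total score: r = .59 between situations; reliabilities .93 (industrial)
# and .94 (vocational), as given in the text of this article.
total_corrected = correct_for_attenuation(0.59, 0.93, 0.94)
```

With reliabilities this high the correction is small: the Total-score value rises only from .59 to about .63.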

It will be seen that the correlations be- 
tween the scores for each trait in the two 
situations range from .64 to .79; the value 
for the Total score is .59. This is a depar- 
ture from what may ordinarily be expected 
from a summary score. In a personal com- 
munication, the test author proposes that, for 
the present data, unit increases in score on 
any trait are smaller, in standard score terms, 
than the sum of these units, in standard score 
terms, for the Total score. Under this con- 
dition, the test-retest correlations may be less 
disturbed by introduced variance for the 
traits than for the Total. An empirical test 
of this has been undertaken by the test author.
His experience is reported in the following
paragraph:


I tried this out on my own data where, with 121 
subjects, I obtained test-retest correlations of .798, 
.678, .739 and .782 for ARES and .793 for T.... 
I added an additional 20 phony characters with 
scores of -5 on each trait for the pretest and scores
of +5 on each trait of the posttest. The artificial r
for Responsibility, which was the lowest, dropped to
.620 while that for the Total dropped to .602, which 
is lower than the lowest trait score.1 
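
The mechanism described in this communication can be imitated with synthetic data. The sketch below does not use Gordon's data, which are not given here; it assumes four independent traits with a true test-retest correlation of .75 and a standard deviation of 5, all of which are illustrative choices, and every name in it is introduced for the example.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def make_trait(rng, n, r_true, sd):
    """Synthetic pretest/posttest scores with population correlation r_true."""
    pre, post = [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = r_true * a + sqrt(1 - r_true ** 2) * rng.gauss(0, 1)
        pre.append(sd * a)
        post.append(sd * b)
    return pre, post


rng = random.Random(1956)          # fixed seed so the run is repeatable
N_REAL, N_FAKE = 121, 20           # sample sizes taken from the quotation
traits = [make_trait(rng, N_REAL, 0.75, 5.0) for _ in range(4)]
total_pre = [sum(t[0][i] for t in traits) for i in range(N_REAL)]
total_post = [sum(t[1][i] for t in traits) for i in range(N_REAL)]

before_traits = [pearson_r(pre, post) for pre, post in traits]
before_total = pearson_r(total_pre, total_post)

# Add 20 artificial cases: -5 on each trait at pretest, +5 at posttest,
# hence -20 and +20 on the four-trait Total.
after_traits = [
    pearson_r(pre + [-5.0] * N_FAKE, post + [5.0] * N_FAKE)
    for pre, post in traits
]
after_total = pearson_r(total_pre + [-20.0] * N_FAKE,
                        total_post + [20.0] * N_FAKE)
```

Under these assumptions every coefficient falls when the artificial cases are added, and the Total falls furthest because the shifts on the four traits accumulate in the sum. The size of the gap depends heavily on the assumed independence of the traits, so it should not be read as a reproduction of the figures quoted above.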


Conclusions 


1. In general, individuals have a slight 
tendency to show themselves to better advan- 
tage in the simulated industrial selection situa- 
tion than in the simulated vocational guidance 
situation. The difference between the means 
of the Total score for the two administrations 
is statistically significant at the 5 per cent 
level, but this difference is not of great prac- 
tical significance, being equivalent to an in- 
crease of about 8 percentile points. 

2. Of the four scales in the test, Responsi- 
bility shows a significant difference in favor 
of the simulated industrial selection adminis- 
tration at the 1 per cent level of confidence. 
This mean increase is equivalent to about 
9 percentile points. The difference between 
mean scores on the two administrations for
Ascendancy, Emotional Stability, and Sociability
scales is not statistically significant.

1 L. V. Gordon, Personal communication. June 14, 1955.

3. The correlation coefficients for the four 
traits between the scores on the simulated in- 
dustrial selection and vocational guidance ad- 
ministrations are substantial. This indicates 
that the subjects did not change their re- 
sponses substantially from one set of direc- 
tions to the other. 

4. Present results support the contention 
that the Gordon Personal Profile “. . . prob- 
ably is less subject to ‘faking’ than inventory- 
type instruments.” The results of the pres- 
ent study contrast with those reported for 
nonforced-choice type instruments in resist- 
ance to being faked. 


Received August 8, 1955. 


References 


1. Gordon, L. V. Manual, Gordon Personal Profile.
Yonkers, N. Y.: World Book, 1953. 

2. Longstaff, H. P., & Jurgensen, C. E. Fakability 
of the Jurgensen Classification Inventory. J. 
appl. Psychol., 1953, 37, 86-89. 





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


Evaluation of Angular Digits and Comparisons with a
Conventional Set ¹


P. J. Foley 


Defence Research Medical Laboratories, Toronto 


Lansdell (4) has studied the legibility of 
two sets of conventional type digits, those of 
Mackworth (1) and Mound (4), and a set 
of digits designed to make maximum use of 
easily discriminated forms. When compared 
under poor viewing conditions which gave 
51.5 per cent correct identification of the 
conventional sets, the new digits were cor- 
rectly identified 67.4 per cent of the time. 
The design of the new set of digits is such 
that they are recognized as numbers with as 
little as three presentations. 

Lansdell has since made some revisions. 
This is a report upon four experiments car- 
ried out with the revised set to answer the 
following questions: 


1. What are the confusion errors? 

2. Is the legibility of these digits independ- 
ent of whether they are presented as black 
figures on a white ground, or as white figures 
on a black ground? 

3. Is this set more legible than a typical 
conventional set under varied conditions of 
exposure and illumination? 

4. Is this set more legible than a typical 
conventional set when the digits are viewed 
obliquely? 

The conventional set chosen for compari- 
son was that of Mackworth (1). This set 
was decided upon because it shows consist- 
ently high performance in comparisons made 
by other investigators (3, 4, 6, 7). 


Method 


General procedure. The procedure common to all 
experiments is as follows: 

The digits were presented singly to the subjects 
(Ss) who viewed them at a distance of 20 feet. The 
exposure time, rate of presentation, and illumination 
level were controlled by the experimenter. 

The Ss sat within a boxlike structure, so that the 
field of view was restricted to the screen and its im- 


1 Defence Research Medical Laboratories Report 
No. 76-2, Project No. D77-94-20-21, (H.R. No. 
117). 


mediate surround. The room itself was dark. Re- 
sponses were noted by an assistant seated just out- 
side the box. 

Apparatus. A magazine-load automatic slide pro- 
jector was used. This was modified so that the 
shutter remained open as the slides were changing, 
ensuring continuous illumination on the screen inde- 
pendently of whether or not a digit was present. 
Each digit thus appeared first as a blur on the screen, 
was brought sharply into focus for the period of ex- 
posure, then disappeared, giving place to the blur of 
the succeeding digit. Preliminary experiments showed 
that this blur had no effect on subsequent legibility. 

The projector was connected to two interval timers 
such that the exposure time and the interval be- 
tween exposures were independently controlled. 

Illumination on the screen was varied by placing 
neutral density filters in front of the projector lens. 
The required values were determined empirically by 
measuring the resultant illumination on the screen. 
These measurements were made using the Macbeth 
Illuminometer. 

The screen, 3 X 3 in., was made of white Bristol 
board and was mounted so that it could be made to 
rotate about its vertical axis. 

The digits used are shown in Fig. 1. Specifications
were as follows.


1. Mackworth: height/width ratio was 2:1. Stroke 
width was constant and equal to 12.5 per cent of 
height. 

2. Revised Lansdell: vertical and horizontal tangents
of each digit, with the exception of the digit
“1,” form a rectangle with a height/width ratio of
2:1. The digit “1” has a height/width ratio of
13.3:1.²

All digits were mounted singly on 2 X 2-in. glass
slides, and when projected on the screen were tg in.
high.

Subjects. The Ss’ ages ranged from 18 to 37 years.
All had 20/20 or better binocular acuity at a distance,
as tested on the U. S. Armed Forces Vision
Tester. Before every experimental session Ss were
shown each digit three times, under the conditions
obtaining in that session, and were told the identity
of the digit before each presentation.

2 A drawing of each of these digits and results of
the statistical analyses for this and succeeding experiments
have been deposited with the American
Documentation Institute. Order Document No. 4835
from ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington 25,
D. C., remitting in advance $1.25 for microfilm
or $1.25 for photocopies. Make checks payable
to Chief, Photoduplication Service, Library of Congress.


Experiment 1: Confusion Errors of Revised 
Lansdell Digits 


Digits were exposed singly for .6 second with a 
presentation rate of one digit every three seconds. 
Each S was given 300 presentations, 30 for each 
digit. The order was random. The illumination 
level on the screen was ten foot-candles. There were 
15 Ss, who had no previous experience with the 
Lansdell digits. 


Results.³ The specific confusions which
contributed more than 5 per cent to the total
error were the 3 with 5 (6.12%), the 3 with
7 (6.0%), the 5 with the 3 (6.6%), the 9
with the 5 (5.2%), and the 0 with the 8
(8.9%).
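
The tabulation behind such a list can be sketched in Python. The trial records below are hypothetical, the function name is introduced here, and the reading of "the 3 with 5" as "3 shown, 5 reported" is an assumption.

```python
from collections import Counter


def confusion_shares(trials, threshold=5.0):
    """Return each (shown, reported) confusion whose share of all errors
    exceeds `threshold` per cent, mapped to that share."""
    errors = Counter((shown, said) for shown, said in trials if shown != said)
    total_errors = sum(errors.values())
    shares = {pair: 100.0 * count / total_errors
              for pair, count in errors.items()}
    return {pair: share for pair, share in shares.items() if share > threshold}


# Hypothetical data: many correct responses, two frequent confusions,
# and every other possible confusion occurring exactly once.
trials = [(d, d) for d in range(10) for _ in range(25)]   # correct responses
trials += [(0, 8)] * 9 + [(3, 5)] * 6                      # frequent errors
trials += [(a, b) for a in range(10) for b in range(10)
           if a != b and (a, b) not in {(0, 8), (3, 5)}]    # 88 single errors

major_confusions = confusion_shares(trials)
```

Here only the two frequent confusions pass the 5 per cent criterion; the 88 single errors each account for under 1 per cent of the total.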


Experiment 2: Revised Lansdell Black on 
White vs. White on Black 


Revised Lansdell digits, black on white, were com- 
pared with revised Lansdell digits, white on black, 
at three illumination levels, 10, 30, and 50 foot- 
candles. Digits were exposed singly for .5 second, 
with a presentation rate of one digit every three 
seconds. The design used was a 3 X 2 factorial, giv- 
ing six conditions per S. Each S was presented with 
each digit three times under each condition, giving a 
total of 30 presentations per S, per condition. Digits
were presented randomly to each S under each condition,
and the conditions were also presented randomly.
The Ss were given five minutes preadaptation
for each illumination level. There were 10 Ss
drawn from the 15 used in Experiment 1.


Results. Analysis of variance shows that
differences between Ss are significant (P <
.01). None of the subject interactions are
significant, however, indicating that the responses
of all Ss are in the same direction over
all conditions. The interaction between digit
type and illumination level is significant (P
< .01), showing that the legibility of the
revised Lansdell digits is not independent of
whether they are presented as black figures on
a white ground or as white figures on a black
ground. If the illumination level is of the
order of 10 foot-candles, then white digits on
a black ground are more legible; if the illumination
level is from 30 to 50 foot-candles,
then black digits on a white ground are more
legible.


3 See footnote 2.


Fig. 1. Lansdell digits and Mackworth digits.


Experiment 3: Revised Lansdell Digits vs. 
Mackworth Digits 


Revised Lansdell digits, black on white, were com- 
pared with Mackworth digits, black on white at 
three illumination levels, 10, 30, and 50 foot-candles, 
and at three exposure times, .3, .8, and 1.3 seconds. 
Digits were exposed singly, with a presentation rate 
of one digit every three seconds. The design was a 
3 X 3 X 2 analysis of variance giving 18 conditions
per S. Each S was presented with each digit three 
times during each condition, giving a total of 30 
presentations per S per condition. Digits were pre- 
sented randomly to each S under each condition. 
In order to decrease the length of each session by 
avoiding pre-adaptation periods between conditions, 
and since the effect of illumination level on both 
digit types was known, a split-plot design was used 
(2). This design confounded illumination with ses- 
sions, but did not affect the main comparison of ex- 
posure time with digit type. Six Ss from Experi- 
ment 2 were used. One of the six possible orders of 
the three illumination levels was allotted to each S. 
The conditions of exposure with digit type were 
randomized within each session. 
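
The session layout described above can be written out as a short Python sketch. It assumes one illumination level per session, which is how the confounding of illumination with sessions is read here; the function name, the seed handling, and the mapping of subjects to illumination orders are hypothetical.

```python
import itertools
import random

ILLUMINATION_FC = (10, 30, 50)          # foot-candles
EXPOSURE_SEC = (0.3, 0.8, 1.3)          # exposure times in seconds
DIGIT_TYPES = ("Revised Lansdell", "Mackworth")


def session_plan(subject_index, seed=0):
    """Build the 18 conditions for one S: three sessions (one per
    illumination level), each with six randomized exposure x digit-type
    conditions of 30 randomly ordered digit presentations."""
    rng = random.Random(seed + subject_index)
    orders = list(itertools.permutations(ILLUMINATION_FC))  # six possible orders
    illumination_order = orders[subject_index % len(orders)]
    plan = []
    for illumination in illumination_order:
        within = list(itertools.product(EXPOSURE_SEC, DIGIT_TYPES))
        rng.shuffle(within)                      # randomize within the session
        for exposure, digit_type in within:
            digits = [d for d in range(10) for _ in range(3)]  # each digit 3 times
            rng.shuffle(digits)
            plan.append((illumination, exposure, digit_type, digits))
    return plan


plans = [session_plan(s) for s in range(6)]      # six Ss, one order each
```

Each plan holds 18 conditions of 30 presentations, and the six Ss between them cover all six orders of the three illumination levels.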


Results. Analysis of variance shows that 
the difference between the legibility of the re- 
vised Lansdell digits and the Mackworth 
digits is highly significant (P < .01). 

The interaction between exposure and digit 
type is not significant. There is a highly significant
increase in percentage correct from
Exposure 1 to Exposure 3 (P < .01). There 
is no evidence of departure from linearity 
when Exposure 1 and Exposure 3 are com- 
pared with Exposure 2. 

Similarly for illumination levels—as illumi- 
nation increases from 10 to 50 foot-candles— 
there is a highly significant increase in per- 
centage correct (P < .01). There is no evi- 
dence of departure from linearity. 


Experiment 4: Revised Lansdell Digits vs. 
Mackworth Digits at Different Angles of 
View 

Revised Lansdell digits, black on white, were compared
with Mackworth digits, black on white, at
three angles of view, 45° left, normal, and 45° right.
Digits were exposed singly for .8 second, with a
presentation rate of one digit every three seconds.
The illumination on the screen was 30 foot-candles.
The design was a 3 X 2 analysis of variance giving
six conditions per S. Each S was presented with
each digit three times during each condition, giving
a total of 30 presentations per S, per condition.
Digits were presented randomly, as were conditions.
There were five Ss, all of whom had been used in the
previous experiments.


Results. The revised Lansdell digits are 
significantly more legible than the Mack- 
worth digits under these conditions (P < 
.01). There is no interaction between digit 
type and viewing angle. The degrees of freedom
for angles of view were broken up and
the following comparisons made: (a) 45° R
- 45° L; the difference is not significant, indicating
that it does not matter from which
side the digits are viewed. (b) 45° R + 45°
L - 2 (normal); the difference is significant,
and shows a decrease in legibility between
the normal and the oblique angle of view.
None of the interactions approach significance.
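
The two comparisons can be expressed as contrast coefficients on the three angle means. In the Python sketch below the coefficients follow the comparisons stated above, while the mean percentages are hypothetical and the names are introduced for illustration.

```python
ANGLES = ("45° left", "normal", "45° right")

# (a) 45° R - 45° L: does the side of viewing matter?
SIDE_CONTRAST = (-1, 0, 1)
# (b) 45° R + 45° L - 2 (normal): oblique versus normal viewing.
OBLIQUE_CONTRAST = (1, -2, 1)


def contrast_value(means, coefficients):
    """Weighted sum of condition means for one comparison."""
    return sum(m * c for m, c in zip(means, coefficients))


def are_orthogonal(first, second):
    """Two contrasts are orthogonal when their coefficient products sum to zero."""
    return sum(a * b for a, b in zip(first, second)) == 0


# Hypothetical mean percentages correct at the three angles.
hypothetical_means = (70.0, 82.0, 71.0)
side_difference = contrast_value(hypothetical_means, SIDE_CONTRAST)
oblique_vs_normal = contrast_value(hypothetical_means, OBLIQUE_CONTRAST)
```

Because the two sets of coefficients are orthogonal, they split the two degrees of freedom for angle of view into independent single-degree comparisons. With the hypothetical means, the side comparison is near zero and the oblique comparison is clearly negative, the pattern the text describes.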


Summary and Conclusions 


A new set of digits designed to make maxi- 
mum use of easily discriminated forms was 
studied. Data on confusion errors are given. 
The legibility of the new digits is not inde- 
pendent of whether they are presented as 
black on a white ground or as white on a 
black ground. At low illumination levels 
white on black is more legible, the reverse 
being true at high illumination levels. Com- 
parisons with a conventional set, the Mack- 
worth digits, at different illumination levels, 
exposure times, and angles of view, show the 
new set to be significantly more legible under 
all of these conditions. 


Received September 19, 1955. 


References 


1. Bartlett, F., & Mackworth, N. H. Planned seeing.
London: H. M. Stationery Office, 1950.

2. Cochran, W. G., & Cox, G. M. Experimental designs.
New York: Wiley, 1950. Ch. 7.

3. Crook, M. M., & Baxter, F. S. The design of
digits. USAF, WADC Tech. Rep., 1954, No.
54-262.

4. Lansdell, H. The effect of form on the legibility
of numbers. DRML Report No. 76-1, Canad.
J. Psychol., 1954, 8, 77-79.

5. Quenouille, M. H. Introductory statistics. London:
Butterworth-Spring, Ltd., 1950.

6. Reinwald, F. L. Design of visual displays. Hamilton,
N. Y.: Colgate University, 1953. (Dept.
of Psychol., Prog. Rep. No. 3, Contract AF
30 (602)-212.)

7. Schapiro, H. B. Factors affecting legibility of
digits. USAF, WADC Tech. Rep., 1952, No.
52-127.





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


Dimensional Analysis of Motion: IX. Comparison of
Visual and Nonvisual Control of Component
Movements ¹


Janet Huiskamp, Robert C. Smader, and K. U. Smith 


University of Wisconsin 


This is a study of perception and human 
motion. In the investigation an attempt is 
made to determine the role of perceptual fac- 
tors in the determination of the component 
movements making up a skilled pattern of 
motion. 

The study to be described here deals with 
a comparison of visual and nonvisual control
of the component movements of manipulation 
and travel in a panel control task. The re- 
sults of the study, as it has been planned, 
have a bearing on a number of problems. 
The data obtained on skilled performance 
provide information on the nature of blind 
movements. Results on learning under the 
different perceptual conditions suggest cer- 
tain theoretical and practical relations be- 
tween perception and motion. In addition, 
data on transfer effects are interpreted in re- 
lation to the problems of the organization of 
motion in work. 


Method 


Apparatus

Electronic methods of motion analysis are used in 
this experiment to measure precisely the duration of 
component movements involved in a panel control 
task. These methods have been described in detail 
before (1, 2) and will be discussed here only in a 
general way. 

The apparatus is presented schematically in Fig. 1. 
There are five rows of switches or knobs on the 
work panel, each of which turns only to the right 
and through an arc of about 45 degrees. The distance
between all adjacent knobs, horizontally and
vertically, is the same. Only four rows of knobs 
with the four center switches in each row were used 
and the remainder were masked. Each knob on the 
panel was comfortably within reach of the S, whose 
task was to turn each of the 16 knobs one after the 
other as quickly as possible. 

The internal housing of the apparatus is behind
the panel and out of S’s view. It consists of an
electronic relay which is on a current level of subthreshold
value for the human skin. Connections
are made between the human operator and this relay
by means of an electrode held in S’s left hand. Connections
are also made between the relay and the
knobs to be manipulated so that when S comes in
contact with any knob, the circuit is completed.

1 This research was supported by funds provided
by the National Science Foundation for the project
“Perception and Human Motion.”

The two clocks used in this apparatus record ma- 
nipulation time and travel time separately. The ma- 
nipulation clock starts recording as soon as the first 
knob in a pattern is contacted or manipulated, and 
stops at the moment S releases that knob. As soon 
as this first knob is released, the travel clock starts 
and continues to run until the next knob is grasped. 
Then the travel clock stops and the manipulation 
clock starts again. The two different movement 
times are accumulated separately on the two clocks. 
All recording starts with the manipulation of the 
first knob contacted, which can be anywhere on the 
board, and ends with the manipulation of the final 
knob, which can be located as desired by means of 
an “ending” plug. This terminal knob stops both 
clocks at the end of the trial. 
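
The way the two clocks divide a trial can be illustrated with a short sketch. It is not a description of the relay circuit itself; it assumes, for illustration only, that each knob contributes one contact time and one release time in seconds, and the function name `accumulate_times` is hypothetical.

```python
from typing import List, Tuple


def accumulate_times(events: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (manipulation_total, travel_total) in seconds.

    `events` holds one (contact_time, release_time) pair per knob, in the
    order the knobs were turned. This input format is an assumption made
    for illustration; the original apparatus used relay-driven clocks.
    """
    manipulation = 0.0
    travel = 0.0
    previous_release = None
    for contact, release in events:
        if release < contact:
            raise ValueError("a knob cannot be released before it is contacted")
        if previous_release is not None:
            if contact < previous_release:
                raise ValueError("knobs must be listed in the order turned")
            # Travel clock: runs from release of one knob to grasp of the next.
            travel += contact - previous_release
        # Manipulation clock: runs while the hand is in contact with a knob.
        manipulation += release - contact
        previous_release = release
    return manipulation, travel


# Three knobs: contact and release times in seconds (made-up values).
totals = accumulate_times([(0.0, 0.25), (0.5, 0.75), (1.0, 1.5)])
print(totals)
```

Recording begins with the first contact and ends with the last release, so no travel time is counted before the first knob or after the terminal knob, as in the procedure described above.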


Experimental Design and Procedure 


The general design of this experiment is indicated 
in Table 1. Twenty-four university students were 
divided into two matched groups on the basis of pre- 
test scores. The groups were treated identically in 
all experimental conditions except that one group 
performed the task visually and the other group 
performed it blindfolded. These two groups will be 
referred to hereafter as the visual and the blind 
groups. The design covers both practice and trans- 
fer effects. As shown in Table 1, Ss were first given 
a pretest on both conditions, then had 10 days of 
practice in either the visual or blind condition. On 
the twelfth day, transfer trials were run. 

Each S performed the knob-turning task in four 
different directions. The directions used are, A, from 
left to right, B, from right to left, C, from bottom 
to top, and D, from top to bottom. The effects of 
sequence of presentation of these different directions 
of movement were controlled by using a replicated 
latin-square design. 
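
The particular Latin square used is not given in this report. As a sketch only, the following shows one common way to build a 4 x 4 square for the directions A, B, C, and D and to replicate it over a group of Ss. The cyclic construction and the function names are assumptions for illustration, not the authors' procedure.

```python
from typing import List, Sequence


def cyclic_latin_square(conditions: Sequence[str]) -> List[List[str]]:
    """Build a cyclic Latin square: each condition appears once per row and column."""
    k = len(conditions)
    return [[conditions[(row + col) % k] for col in range(k)] for row in range(k)]


def assign_sequences(n_subjects: int, conditions: Sequence[str]) -> List[List[str]]:
    """Replicate the square by cycling through its rows, one row per subject."""
    square = cyclic_latin_square(conditions)
    return [square[i % len(square)] for i in range(n_subjects)]


# Hypothetical use: 12 Ss in one group, three replications of the square.
orders = assign_sequences(12, "ABCD")
for subject, order in enumerate(orders, start=1):
    print(subject, "".join(order))
```

A cyclic square equates the ordinal position of each direction but does not balance which direction follows which; a design concerned with such carry-over effects would need a different square.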

Before starting the experiment, each S was pre- 
sented with a schematic diagram of the arrangement 
of the knobs. The four directions of movement were 
illustrated and the nature of the task was explained. 
Each S was told that he would be blindfolded dur- 
ing his first experience with the apparatus. 

Without having seen the apparatus, each S performed
blindfolded 12 trials, three trials in each direction
of movement. All Ss performed an ABCD sequence.
Before the first trial of a new direction of movement,
the experimenter took S’s hand and traced, by touching
the knobs, the directional pattern to be followed.
When the blind performance was completed, the blindfold
was removed and S repeated his performance visually.
By means of this pretest, Ss were matched and one of
each pair was assigned to the blind condition, the
other serving in the visual condition.

On Day 2, each S performed a particular sequence
of directional trials assigned to him by the replicated
latin-square design. From Day 2 to Day 11, each
S received practice under his perceptual condition.
Practice consisted of three trials in each of the four
directional patterns. The median score for each
direction was taken as his score for the day.

On the twelfth day, transfer tests were run. In
these transfer tests, Ss who had practiced visually
now performed blindfolded, and those who had practiced
blind now performed visually.


Fig. 1. Diagram of the preplanned work panel and the electronic motion analyzer.
[Labeled parts: universal control panel, internal housing, electronic relays,
terminal board, manipulation time clock, travel time clock.]


Table 1

General Design

Pretest, Day 1
    All Ss perform 12 blind and 12 visual trials.
    Used for matching.
Learning, Day 2 through 11
    Group I: 12 blind trials for 10 days
    Group II: 12 visual trials for 10 days
Transfer, Day 12
    Group I: 12 visual trials
    Group II: 12 blind trials


Results ²


The results of this experiment will be dis- 
cussed in terms of differences between the 
visual and blind groups in relation to the 
following: (a) skilled performance on the 
eleventh day of practice, (b) acquisition of 
skill as a function of practice, and (c) trans- 
fer effects. 


Skilled Performance 


The differences between the visual and 
blind groups on both the manipulative and 
travel components of the task are shown 
graphically in Fig. 2. In this bar graph, the 
data for the visual and blind groups are 
shown separately for the two different com- 
ponent movements. The mean duration of 
both movement components is significantly 
greater for the blind group. 

The results of the analyses of variance of
the data on which Fig. 2 is based are shown
in Table 2. In addition to the differences
indicated in Fig. 2, the analysis shows that the
direction of movement significantly affected
the travel component of the task. The significant
group-by-direction interaction for the
travel component indicates that the four
directional patterns were affected differently by
the two perceptual conditions.


Fig. 2. Bar graph showing the difference between visual and blind
conditions of manipulation and travel. [Vertical axis: seconds; bars
for the visual and blind groups, shown separately for manipulation
and travel.]


Table 2

Analyses of Variance for the Manipulative and Travel
Times of Skilled Performance on Day 11

Source                 Mean Square      F

Manipulation Time
Group                   117.3953      46.20**
Direction                  .2448       2.39
Trials                     .1788       1.74
Ss/Groups                 2.5413      24.80**
Group × direction          .2199       2.14
Group × trials             .0501
Error                      .1025
Total

Travel Time
Group                    12.4272       9.88**
Direction                 1.1097      12.89**
Trials                     .0684
Ss/Groups                 1.2582      14.61**
Group × direction          .6248       7.26**
Group × trials             .1198       1.39
Error                      .0861
Total

** Significant beyond the 1% level.


² The summaries of the analysis of variance of the
different parts of the data, together with a summary
of the critical data on which these analyses are based,
are on file with the American Documentation Institute.
Order Document No. 4776 from ADI Auxiliary
Publications Project, Photoduplication Service,
Library of Congress, Washington 25, D. C., remitting
in advance $1.75 for microfilms or $2.50 for
photocopies. Make checks payable to Chief,
Photoduplication Service, Library of Congress.


Acquisition


Figure 3 shows graphically the course of
learning for the 11 days of practice. In this
figure, the curves for the manipulative component
are presented to the left, those for
travel to the right. The two different curves
for each component movement represent the
perceptual conditions used. It can be observed
that the difference between the visual
and blind groups for both manipulative and
travel components is greater on the first day
than on any succeeding day. This difference
is not only the result of the perceptual
conditions, but also reflects the fact that on
Day 1, the pretest day, the blind Ss had
never had any contact with the apparatus
while the visual Ss had had 12 blind trials
preceding their visual trials, which appear on
the graph.


Fig. 3. Learning curves for manipulation and travel under the visual and blind conditions.
[Two panels, manipulation (left) and travel (right); vertical axes: seconds;
horizontal axes: days; one curve each for the visual and blind groups.]

In analyzing the learning data, difference 
scores were obtained for each S by subtract- 
ing the scores made on Day 11 from the scores 
made on Day 2. Separate t tests were run 
on the visual and blind groups for manipula- 
tion and travel scores to determine whether 
the difference between Day 2 and Day 11 was 
significantly greater than zero. The results 
indicate that the visual and the blind groups’ 
performance on the manipulative component 
of the task was improved significantly by 
practice. The t value was significant beyond 
the 1 per cent level. The blind group showed 
a learning effect in the travel component 
which was significant at the 5 per cent level, 
while the visual group showed no significant 
learning effect in this component. 
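
The test on difference scores described above can be sketched as follows. The scores in the example are invented for illustration and are not the study's data; the function name `difference_score_t` is likewise hypothetical. The sketch computes the t statistic for the hypothesis that the mean Day 2 minus Day 11 difference is zero.

```python
import math
import statistics
from typing import Sequence, Tuple


def difference_score_t(day2: Sequence[float], day11: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees_of_freedom) for the mean of Day 2 minus Day 11 scores."""
    if len(day2) != len(day11):
        raise ValueError("each S needs a score on both days")
    diffs = [early - late for early, late in zip(day2, day11)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation (n - 1 in the denominator).
    sd_diff = statistics.stdev(diffs)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Invented scores in seconds for four Ss (illustration only).
t_value, df = difference_score_t([5.0, 6.0, 7.0, 8.0], [4.0, 4.0, 6.0, 6.0])
print(round(t_value, 3), df)
```

A positive t indicates shorter durations on Day 11 than on Day 2; the value would then be referred to a t table with n - 1 degrees of freedom.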


Transfer 


We have examined the problem of transfer 
of training in this study in order to determine 
whether the effects of visual and blind learn- 
ing have different influences on performance 
on the nontrained task. By the electronic
methods used, we are able to determine
whether these influences affect either the
manipulative or the travel component of the
task, or both.


Table 3

Mean Duration in Seconds for Skilled Performance
and Transfer Performance

                                                   Day 11      Day 12
Condition                         Component        Training    Transfer

Visual Training, Blind Transfer   Manipulative     3.66        6.65
                                  Travel           3.51        4.67

Blind Training, Visual Transfer   Manipulative     5.87        3.87
                                  Travel           4.12        3.49

Transfer of training has been considered in 
two ways in this study. First, in the tradi- 
tional manner, pretest and transfer test scores 
for the two groups were compared to see if 
the two perceptual conditions in training had 
a differential effect. That is to say, the visual 
pretest and the visual transfer test for the 
blind-training group were compared with the 
blind pretest and the blind transfer test for 
the visual-training group. Second, the scores 
on the last day of training and the scores on 
the transfer-test day have been compared to 
see if the change in perceptual conditions was 
reflected in a performance change. 

An analysis of covariance was used to com- 
pare the pretest and transfer-test scores for 
the two groups. A simple analysis of variance 
showed that there was a statistically signifi- 
cant difference between the two groups on the 
pretest day, indicating that the group per- 
forming visually was superior for both ma- 
nipulation and travel. The same results were 
obtained from a simple analysis of variance 
of the transfer test scores. These results 
would be expected because of the two radi- 
cally different perceptual conditions. By 
means of the analysis of covariance, the pre- 
test difference between the groups was re- 
moved so that any differences remaining be- 
tween the two groups could be attributed to 
differences in the training conditions. How- 
ever, when the pretest differences were re- 
moved, the difference between the two groups 
on the transfer test was not significant. This 
finding indicates that the training in neither 
perceptual condition was more helpful in 
transfer to the other condition. 

Table 3 shows the mean scores obtained by 
the groups on Day 11 and Day 12. It is ap- 
parent from the table that a change in per- 
ceptual conditions profoundly affects perform- 
ance, even after training. The visual group, 
when performing blind on Day 12, shows an 
increase in the mean duration of both travel 
and manipulation times. The blind group, 
on the other hand, shows a decrease in mean 
duration of these times when performing 
visually on Day 12. Analyses of variance
indicate that these changes are significant for
both groups for both the manipulative and 
travel components. 

Table 4 presents the data in Table 3 re- 
arranged for ease of comparison between the 
two groups when performing the same task. 
The blind task scores are those for the
blind-trained group on Day 11 and the visually
trained group on Day 12. The visual task
scores are those for the visually trained group
on Day 11 and the blind-trained group on
Day 12. The analyses of variance indicate
a significant difference at the 5 per cent level 
in favor of the blind-trained group on the 
blind task for both manipulation and travel. 
There was no significant difference between 
groups on the visual task. 


Discussion and Summary 


The role of perception in human motion is 
a broad problem touching upon many theo- 
retical and applied aspects of psychology. 
Currently this problem is a very lively one in 
the fields of human engineering and of time 
study in industry. We have made one spe- 
cial approach to this general problem in the 
present investigation in terms of observations 
of the effects of visual and blind conditions
of performance upon the different component
movements in human motions. In this ex- 
periment the problem of the role of perception 
in performance is brought into relation to the 
whole field of motion analysis. 

In order to conduct the present research, 
special electronic methods of motion analysis 
are applied to the measurement of the ma- 
nipulation and travel components of move- 
ment in a panel control task. The durations 
of these movements are measured separately 
under blind and visual conditions of perform- 
ance. 

The significant difference between the lev- 
els of skilled performance for the visual and 
blind groups for both components shows that 
in the present case perceptual conditions pre- 
dominate in the performance. Practice under 
the blind condition does not compensate for 
the loss of vision. It is unlikely that further 
practice would enable the blind group to equal 
the performance of the visual group since the 
learning curves appear to have leveled off. 

The results of the comparison of visual and
blind skilled performance show that percep- 
tion does not have a particular role in deter- 
mining one aspect or component part of the 
task. Perceptual factors appear to affect both 
manipulation and travel. The blind indi- 
vidual is restricted in fine movements and in 
gross travel movements. 

The data on acquisition indicate that both 
groups showed a significant improvement in 
performance on the manipulative component 
as a function of practice. A significant im- 
provement in travel movement with practice 
occurs only with the blind group. This im- 
provement may be explained most directly in 
terms of the perceptual difficulty of the blind 
task. Previous studies have shown that per- 
ceptual “loading” of a task will give rise to 
learning in the travel component which is 
otherwise not present to any great extent 
(3, 4). 

In the case of manipulation, loss of vision 
does not seem to affect the course of learn- 
ing, but prevents the attainment of a skill 
level equal to that achieved by the use of 
vision. If a little speculation is allowed, it 
can be said that manipulation under blind 
conditions would never reach the level of 
visually controlled manipulation in this task. 

The transfer data indicate that training 
under one perceptual condition is not more 
beneficial than training under the other when 
the groups are tested on the nontrained con- 
dition. Visual training facilitates blind per- 
formance and blind training facilitates visual 
performance, but the analysis of covariance 
does not indicate that there is a difference
between the performance of the two groups
which can be attributed to the perceptual
difference in training.


Table 4

Comparison of Performance of Visual and Blind Groups
Performing the Same Task in Terms of
Mean Duration in Seconds

Task      Component       Blind-Trained Group    Visually Trained Group

Blind                     Day 11                 Day 12
          Manipulation    5.87                   6.65
          Travel          4.12                   4.67

Visual                    Day 12                 Day 11
          Manipulation    3.87                   3.66
          Travel          3.49                   3.51

The design of the transfer part of this 
study has provided the means of getting data 
on the problem of the automatizing of com- 
ponent movements in a learned motion. We 
wish to know whether either of the com- 
ponent movements in the task have been 
made automatic through learning to the ex- 
tent that change in the perceptual conditions 
will not change their level of performance. 
The data clearly show that neither of the two 
component movements was automatized to 
this extent. When changed to the blind con- 
dition, the visually trained group showed a 
deterioration in performance. When tested 
visually, the blind group showed an improve- 
ment in performance. The perceptual con- 
ditions as such outweigh any automatic fea- 
tures in the two movements resulting from 
learning. 

The role of perception in motion is one of 
the outstanding problems of industrial work. 
Long-standing theory and practice related to 
motion and time study in industry has dealt 
with this problem in only a superficial way;
i.e., by the assumption of specific perceptual
therbligs of search and select. The present 
study points up certain important facts about 
the role of perception in work motions. Per- 
ceptual conditions appear to define in a main 
way the specific properties of different therbligs
or parts of the work task, and not just
one part. Furthermore, the general per- 
ceptual conditions involving use of vision and 
its absence seem to have a predominant role, 
in comparison to learning, in defining the ab- 
solute level of performance of the different 
component movements of manipulation and 
travel in skilled motion. 

The problems of motion analysis in in- 
dustry heretofore have not been brought into 
close relation with the varied phenomena of 
perception that have been studied in psychol- 
ogy. This experiment not only serves to em- 
phasize the importance of such scientific in- 
vestigation, but it also proves that electronic 
methods of motion analysis make possible the 
detailed study of motion in relation to per- 
ceptual factors. 


Received July 21, 1955. 


References 


1. Davis, R., Wehrkamp, R., & Smith, K. U. Dimensional
   analysis of motion: I. Effects of laterality and
   movement direction. J. appl. Psychol., 1951, 35, 363-366.

2. Rubin, G., von Trebra, P., & Smith, K. U. Dimensional
   analysis of motion: III. Complexity of movement
   pattern. J. appl. Psychol., 1952, 36, 272-276.

3. Seymour, D. Manual skills and industrial productivity.
   Production Engineers J., 1954, 3-10.

4. Smader, R. The relation between perceptual complexity
   in a task and human motion. Unpublished doctor’s
   dissertation, Univer. of Wisconsin, 1955.





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


A New Technique for Rapid Item Analysis ¹


Carlos A. Cuadra 


Veterans Administration Hospital, Downey, Illinois 


The resurgence of interest in self-report 
techniques during the past 15 years has been 
closely associated with, if not one result of, 
an increasingly empirical approach to per- 
sonality assessment. With a number of 
standardized research instruments (2, 3, 4) 
providing a reservoir of items to tap impor- 
tant characteristics and attitudes, the horizons 
of empirical research are limited only by the 
fertility of the experimenter’s imagination, 
the availability of stable criterion groups, and 
—of special importance—facilities for devel- 
oping new measures through item analysis. 
Unfortunately, the development of new em- 
pirical scales is hampered by the inaccessi- 
bility to most clinician-researchers of the 
modern electronic equipment which can re- 
duce a laborious task to manageable propor- 
tions. 

The purpose of this paper is to describe a 
simple new method to increase the speed and 
accuracy of item analysis without elaborate 
or expensive equipment. The method, thus 
far applicable only to Hankes-type (or Test- 
scor) answer sheets, involves the transferring 
of individual item responses to two or three 
specially designed 8 by 8-inch cards from 
which item tallies can quickly be made. Fig- 
ure 1 shows one such card designed by the 
author and named the Item Record Card. 
The spaces along the edges of the card corre- 
spond to the spaces on the Hankes Answer 
Sheet. 

With an Item Record Card placed directly
under the first row of “true” responses on
the answer sheet, these responses may be
entered rapidly on the card by means of diagonal
marks. At the end of the first row of 30
items, the card is turned 90 degrees and the
next 30 items entered. When 120 items have
been completed, the card is turned over and
a second 120 items entered on the reverse
side. A 472-item protocol from the California
Psychological Inventory, for example,
can be completely transferred to the two
Item Record Cards needed in less than two
and one-half minutes.

¹ Sponsored by the Veterans Administration and
published with the approval of the Chief Medical
Director. The statements and conclusions published
by the author are the result of his own study and
do not necessarily reflect the opinion or policy of
the Veterans Administration.

Once a series of protocols is entered on 
cards, an item tally is quite simple. The 
cards are laid in a column, exposing only the 
marked edges, and the total number of “true” 
responses to each item may be obtained easily 
by running down the successive columns. 
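
The tally itself is a column count across cards. As a rough present-day analogue, and not part of the card method, the sketch below represents each protocol as the set of item numbers answered “true” and counts, item by item, how many protocols in each criterion group carry a mark. The function names and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def tally_true(cards: Iterable[Set[int]]) -> Counter:
    """Count, for each item number, how many cards carry a 'true' mark."""
    counts: Counter = Counter()
    for marked_items in cards:
        counts.update(marked_items)
    return counts


def item_differences(group_a: Iterable[Set[int]],
                     group_b: Iterable[Set[int]],
                     n_items: int) -> Dict[int, int]:
    """Difference in 'true' counts (group A minus group B) for items 1..n_items."""
    a_counts = tally_true(group_a)
    b_counts = tally_true(group_b)
    return {item: a_counts[item] - b_counts[item] for item in range(1, n_items + 1)}


# Invented example: two protocols per criterion group, four items.
high_group = [{1, 2}, {2, 3}]
low_group = [{2}, {3}]
print(item_differences(high_group, low_group, n_items=4))
```

Because each protocol is stored once, the same sets can be regrouped under a second criterion without re-entering any responses, which parallels the re-sorting of cards.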

The method described has a number of im- 
portant advantages over other nonmechanical 
methods. First and perhaps most important 
is the saving in time and personnel. An 
ordinary item analysis of, say, 25 versus 25 
MMPI protocols usually takes one worker 
from 15 to 20 hours. With two persons— 
one reading the “true” responses while the 
other records—the job can be reduced to 
from 12 to 15 man-hours. Use of the new 
method described allows one person to do the 
work in only four to five hours. 


Fig. 1. The Item Record Card. [The card face reads: “The Item Record Card.
A Technique for Rapid Item Analysis on Hankes-type Answer Sheets, by
Carlos A. Cuadra, Ph.D.” Response spaces run along the edges of the card.]







A second advantage lies in the method of 
recording. Since it is entirely visual and does 
not involve reading out and recording indi- 
vidual item numbers, the possibility of read- 
ing or recording error is sharply reduced. 

A third advantage may accrue whenever 
more than one criterion is used for the same 
subjects. In such instances the subjects’ 
cards need only be re-sorted before tallying 
the responses of the new criterion group. For 
example, in a study in progress on the selec- 
tion of psychiatric aides, the two important 
criterion variables were (a) tenure, and (b) 
good ward performance. Although separate 
item analyses were carried out for each vari- 
able, a number of long-tenure aides quite 
naturally also fell in the good performance 
group, and since their item responses were 
already available on Item Record Cards, it 
was not necessary to record them again. 
Once entered on the cards, subjects’ responses 
may without further effort become part of 
any number of possible criterion groups. 

One cautionary note should be registered 
at this point. There may be some tempta- 
tion to utilize the Item Record Card itself as 
an answer sheet, with the subject simply
checking the items to which his response is
“true.” While this would certainly eliminate
one operation from the steps involved in item 
analysis, the change from a true-false to a 
check—no-check task might introduce serious 
distortions into the data. As pointed out in 
another article (1), the reduction of any psy- 
chological rating task to a check-list method 
introduces a number of important and for 
the most part unmeasurable psychological 
variables into the rating situation. Used only 
as suggested above, however, the Item Record 
Card may prove a useful adjunct to empirical 
research with personality questionnaires and 
other self-rating techniques. 


Received August 8, 1955. 


References 


1. Cuadra, C. A., & Reed, C. F. Problems in the
   interpretation of check-list data. Unpublished
   study. Author, 1956. (Mimeographed)

2. Gough, H. G. A preliminary guide for the California
   Psychological Inventory. Berkeley: Univer. of
   California Institute of Personality Assessment &
   Research, 1954. Pp. 1-55. (Mimeographed)

3. Hathaway, S. R., & McKinley, J. C. A multiphasic
   personality inventory (Minnesota): I. Construction
   of the schedule. J. Psychol., 1940, 10, 249-254.

4. Jurgensen, C. E. Report on the “Classification
   Inventory,” a personality test for industrial
   use. J. appl. Psychol., 1944, 28, 445-460.







A Methodological Note on Time Intervals Between 
Consecutive Accidents 


Alexander Mintz 


City College of New York 


It was pointed out in an earlier paper (2) 
that the study of time intervals between con- 
secutive accidents of individuals is a suitable 
method of establishing whether or not acci- 
dents have an influence on subsequent acci- 
dent proneness. This is an issue of some 
importance. The arguments in favor of in- 
dividual differences in accident proneness 
which make use of distributions of accident 
frequencies among people generally presup- 
pose that accident proneness is uninfluenced 
by accidents. On the other hand, Horn (1), 
who studied time intervals between accidents 
of airplane pilots, concluded that their acci- 
dent proneness tended to be temporarily in- 
creased by accidents and recommended read- 
justment procedures after accidents. In the 
earlier study, data pertaining to time intervals
between accidents of taxi drivers ¹ were
analyzed by methods differing from Horn’s, 
and it was concluded that in their case there 
was no evidence of increased accident prone- 
ness after accidents. 

The methods of dealing with the data in 
the earlier paper involved comparisons of time 
intervals before the first accidents, between 
early accidents, and between later accidents 
of the same individuals. A different method 
of dealing with the data will be explained 
and demonstrated in this paper: It will be 
shown how the frequency distribution of time 
intervals between a particular pair of acci- 
dents may be compared to a theoretical dis- 
tribution of such time intervals; such a theo- 
retical distribution is derived from the as- 
sumptions according to which accident prone- 
ness is constant in time, and accidents occur 
purely at random over the time interval rep- 
resenting the observation period. The mate- 
rial prepared to illustrate an application of 
such a method will also be compared to the
material presented by Horn in his study of
airplane pilots.

¹ The data had been kindly contributed by
Professor E. Ghiselli, University of California.

Contrary to what one might suppose, the 
random distribution of a number of events 
over an observation period contains quite un- 
equal numbers of short time intervals and 
long time intervals between consecutive events. 
The earlier paper already cited (2) summa- 
rizes the mathematical rationale, lists refer- 
ences, and gives a formula for the probability 
distribution of time intervals of different 
durations. If n accidents happen to each of
a number of persons during an observation
period of duration D, the probability of time
interval x between consecutive accidents
within this period is given by the equation

$$y = n\left(1 - \frac{x}{D}\right)^{n-1}.$$

The probability of x being between the limits
$x_1$ and $x_2$ is then given by the definite integral

$$\int_{x_1}^{x_2} n\left(1 - \frac{x}{D}\right)^{n-1} dx.$$

If a group of people had varying numbers of 
accidents, the probability of the time interval 
between, e.g., their first and second accident 
being between $x_1$ and $x_2$ is given by a weighted 
average of definite integrals of the above 
form. The weights correspond to the num- 
bers of people with varying n’s, i.e., with two 
accidents each, three accidents each, etc. One 
can then tabulate the numbers of time inter- 
vals between, e.g., second accidents falling 
within certain time limits after the first acci- 
dent, and compare these empirical frequencies 
to the theoretical ones, as just explained. 
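
The computation just described can be sketched in a few lines. The sketch assumes, as in the application reported below, that time is expressed as a fraction of the observation period (D = 1), so that the definite integral has the closed form (1 - x1)^n - (1 - x2)^n. The function names are hypothetical, and the driver counts in the example are invented and are not those of Table 1.

```python
from typing import Dict


def interval_probability(n: int, x1: float, x2: float) -> float:
    """P(x1 <= interval <= x2) for n random accidents, time in units of D = 1.

    Closed form of the integral of n * (1 - x)**(n - 1) from x1 to x2.
    """
    if not (0.0 <= x1 <= x2 <= 1.0):
        raise ValueError("limits must satisfy 0 <= x1 <= x2 <= 1")
    return (1.0 - x1) ** n - (1.0 - x2) ** n


def expected_frequency(drivers_by_n: Dict[int, int], x1: float, x2: float) -> float:
    """Weighted sum over groups of drivers having n accidents each."""
    return sum(count * interval_probability(n, x1, x2)
               for n, count in drivers_by_n.items())


# Invented distribution: 10 drivers with two accidents, 5 with three.
example_counts = {2: 10, 3: 5}
# Thirteen four-week periods in a one-year observation period.
frequencies = [expected_frequency(example_counts, k / 13, (k + 1) / 13)
               for k in range(13)]
print([round(f, 2) for f in frequencies])
```

The thirteen theoretical frequencies sum to the number of drivers, and they decrease steadily from the first period to the last, so a surplus of short intervals is expected even when accidents occur at random.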

As an example, the times between the first 
and second accidents of the group of taxi 
drivers already discussed in the earlier paper 
were examined by the new method. The ac- 
cidents in which the drivers were involved 
were listed as taking place in particular weeks 


189 





190 


of the year; as an approximation, accidents 
were assumed to happen in the middle of the 
week, and time intervals between the acci- 
dents were computed accordingly.  Sixty- 
seven of the total group of 162 drivers had 
two or more accidents. The numbers of 
drivers with two accidents, three accidents, 
etc. are listed in Table 1. 

The time intervals between the accidents 
were classified in four-week periods. The 
corresponding theoretical frequencies were 
computed as described earlier. Since the ob- 
servation period was one year, and four weeks 
is almost exactly one-thirteenth part of a 
year, D was set as equal to 1, and $x_1$ and $x_2$
in the definite integral became 0 and 1/13,
1/13 and 2/13, etc., respectively. In accordance
with the figures presented in Table 1,
the formula used in order to compute the 
theoretical frequency of time intervals of less 
than four weeks between the first and second 
accidents was therefore 


$$16 \times 2\int_0^{1/13} (1 - x)\,dx \;+\; 13 \times 3\int_0^{1/13} (1 - x)^2\,dx \;+\; 13 \times 4\int_0^{1/13} (1 - x)^3\,dx \;+\; 4 \times 5\int_0^{1/13} (1 - x)^4\,dx, \text{ etc.}$$

For the computations of the theoretical
frequencies of time intervals of longer duration,
e.g., five to eight weeks, the same formula
was used, but with different pairs of
limits for the definite integrals, e.g., 1/13
and 2/13, etc. The obtained and theoretical
frequencies of the time intervals thus computed
are presented in Table 2.

It is apparent that the obtained and the
theoretical frequencies are quite similar, and
the chi-square test indicated that the differences
between them do not approach statistical
significance.² Thus the conclusion of
apparent lack of an effect of accidents on
accident proneness in this set of data, which
was reached in the earlier paper, is confirmed
by the new method.

² In view of the nonsignificant differences between
the theoretical and the empirical distributions,
seasonal fluctuations that may have been present were
not investigated. The small and nonsignificant excess
of less-than-four-weeks time intervals in the empirical
distribution may have been due to this factor.

In addition, Table 2 exhibits another fact: 
Both in the case of the actual and the theo- 
retical frequencies there is a marked pre- 
ponderance of short time intervals over long 
ones. In the case of the theoretical fre- 
quencies, almost a third of the time intervals 
are less than one month, over one-half less 
than two months. This large number of 
short time intervals between consecutive accidents
compared to long ones follows directly
from the fact that the equation given earlier, 


$$y = n\left(1 - \frac{x}{D}\right)^{n-1}$$


is a monotonically decreasing function of x 
for all n’s greater than one. Any weighted 
sum of such functions also monotonically de- 
creases with x. 

In his study of airplane pilots Horn had
used a similar preponderance of short time
intervals over long ones as his principal argument
for the view that their accident proneness
increased after accidents. The discussion
of the preceding paragraph shows that
such a finding is entirely inconclusive. It can
be duplicated by a theoretical distribution of
time intervals based on the assumption that
accidents occur at purely random times. In
order to establish that accidents increase accident
proneness, one has to show that the
excess of short intervals over long ones is
significantly greater than is the case in the
theoretical distribution. Or else one can compare
times between earlier and later accidents
of the same people. In Horn’s paper there is
no tabulation of the time intervals that would
enable one to make the latter comparison. A
tabulation of frequencies of pilots with varying
numbers of accidents during the observation
period is also lacking, so that the theoretical
random distribution of time intervals
cannot be computed. In the absence of such
information it is impossible to tell whether
accident proneness of airplane pilots (unlike
that of taxi drivers) does increase after accidents.


Table 1
Accident Distribution of 67 Drivers with
Two or More Accidents

Number of Accidents    Number of Drivers
2                      16
3                      13
4                      13
5                       4
[rows for more than five accidents are illegible in the source]
Total                  67


Summary


It is shown how information about numbers
of people with varying numbers of accidents,
together with the assumption that accidents
happen to people at random times, may


be used to compute a theoretical distribution 
of time intervals between consecutive acci- 
dents. An obtained distribution of such time 
intervals may then be compared to the theo- 
retical one. As an example, a distribution of 
time intervals between first and second acci- 
dents of a group of taxi drivers was ex- 
amined; it was not significantly different from 
the theoretical distribution. 

It is pointed out that in the theoretical distribution
of time intervals between consecutive
accidents, short time intervals are much
more frequent than long ones. Horn had
previously used a finding of this type as argument
for the view that airplane pilots become
accident prone after accidents; the argument
is entirely inconclusive.

Table 2
Frequencies of Second Accidents of Taxi Drivers
Occurring Within Various Periods of Time
After the First Accident, and
Theoretical Frequencies
(Data collected by Dr. E. Ghiselli)

Week After First     Actual and Theoretical*
Accident             Frequencies

1st to 4th           26   (21.1)
5th to 8th           11   (13.0)
9th to 12th               (9.4)
13th to 16th              (6.5)
17th to 20th              (4.8)
21st to 24th              (3.7)
25th to 28th              (2.7)
29th to 32nd              (1.9)
33rd to 36th              (1.5)
37th to 40th              (1.0)
41st to 44th              (0.6)
45th to 48th              (0.3)
49th to 52nd              (0.1)

Totals                    (66.6)

* The theoretical frequencies, shown in parentheses, were
computed by five-place logarithms and rounded off.


Received August 12, 1955. 


References 


1. Horn, D. A study of pilots with repeated accidents.
   J. aviat. Med., 1947, 18, 440-449.

2. Mintz, A. Time intervals between accidents. J.
   appl. Psychol., 1954, 38, 401-407.





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


A Note on the “Fakability” of the Minnesota Teacher 
Attitude Inventory ¹


A. Garth Sorenson 


School of Education, University of California, Los Angeles 


This investigation was undertaken to dis- 
cover whether or not prospective teachers can 
deliberately change their response to the Min- 
nesota Teacher Attitude Inventory (MTAI) 
in such a manner as to improve their total 
scores significantly. A secondary question con- 
cerns the effect of signing versus not signing 
the answer sheet. Will students who sign 
their names be more or less inclined to fake 
than those who do not sign? There are several
reasons why such a study is important.
Of immediate concern to the investigator was 
the question of whether the inventory, which 
is designed to predict the ability of teachers 
to effect harmonious interpersonal relations in 
the classroom (1, 2), is likely to be of value 
in the selection of candidates for teaching 
credentials. Obviously if it can be readily 
“faked,” its value as a selection device will be 
limited. In at least one study, evidence is 
presented to indicate that the MTAI can be 
faked by college students (3). In another 
study where the MTAI was administered at 
the beginning and at the end of a course, it 
was found that the students’ scores changed 
in the “right” direction. It was not deter- 
mined whether the course effected real changes 
in attitudes or merely made the students 
“test-wise” so far as the MTAI is concerned (4).

Sample and Administrative Procedure 


The subjects of the present study were 406 pro- 
spective teachers, elementary and secondary, in the 
School of Education at U. C. L. A. About half 
were inventoried in the fall semester, 1954, the re- 
mainder in the spring semester, 1955. The prospec- 
tive elementary teachers were enrolled in two sec- 
tions (one each semester) of a class in child growth 
and development. The prospective secondary teachers
were enrolled in four sections (two each semester)
of a class in principles of guidance. Since the
above courses are required of the respective groups
of candidates for teacher credentials, the sample is
probably representative of students in the School of
Education at U. C. L. A.

1 This study was supported in part by the Fund
for Occupational Research of the School of Education,
U. C. L. A. Martin S. Sheldon assisted in the
gathering and tabulation of data.

The inventory was administered during a regular 
class hour. No advance announcement was made. 
As each student entered the classroom, he was handed 
the inventory, an electrographic pencil and two an- 
swer sheets. The answer sheets, bearing duplicate 
numbers, were labeled “A” and “B.” He also re- 
ceived one of two sets of typewritten instructions. 
One set of instructions, distributed to alternate stu- 
dents as they entered the classroom, included these 
directions: 

“The purpose of this exercise is to furnish the in- 
structor with information regarding some of the 
attitudes of this class. You have been given two 
answer sheets. Be sure to put your name on both 
of them. Please indicate your age and sex. Then 
put the “B” answer sheet aside. When you have 
finished filling out the inventory, hold up your hand.” 

The other half of the students received the same 
instructions except they were told it would not be 
necessary to sign their names. 

As soon as a student held up his hand to signify 
completion of the inventory, his “A” answer sheet 
was collected and he was handed a set of instruc- 
tions for the “B” answer sheet, reading as follows: 

“Will you please imagine yourself to be an ap- 
plicant for a teaching position in a school system 
which is known to prefer ‘progressive’ teachers and 
fill out the inventory in such a way as to make a 
good impression.” 


Results 


The effect of instructions to make a good 
impression, i.e., to fake, was studied by group- 
ing the data as indicated in Table 1. It will 
be noted that in each of the groups the mean 
faked score is higher than the mean original 
score and that the difference in means is in 
each case significant at the .001 level of 
confidence. 

A comparison of the unsigned and signed 
answer sheets produced the following data: 
The mean original score for 204 students who 
did not sign the answer sheet was 41, SD 29. 
Their mean faked score was 70, SD 28. 

The mean original score for 202 students
who signed the answer sheets was 46, SD 28.
Their mean faked score was 71, SD 29. 

The difference in mean original scores,
signed versus unsigned, was 5 (critical ratio
1.79), significant at the .05 level of confidence
by a one-tailed test.

The difference in mean faked scores, signed
versus unsigned, was 1 (critical ratio .36), not
statistically significant.

The effect of signing one’s name versus not 
signing was further checked by the following 
procedure. After scoring, the original and 
faked answer sheets were paired according to 
the duplicate numbers on the answer sheets. 
In each case, the original or “A” score was 
subtracted from the “B” or faked score to 
give a “gains” score. The gains scores were 
tabulated separately for the signed and un- 
signed groups. 

The range of gains scores on the 204 unsigned
answer sheets was from -59 to 147
with a mean of 30.8 and an SD of 28.4.

The range of gains scores on the 202
signed answer sheets was from -53 to 111
with a mean gain of 23.7 and an SD of 23.7.

The difference in gains scores for the two 
groups was thus 7.1, critical ratio 2.73, level 
of confidence .01. 
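
The critical ratios reported above can be approximated from the published summary figures. The sketch below assumes the usual critical ratio for two independent groups: the difference in means divided by the square root of the sum of the squared standard errors. The source does not state which formula was used, and the function name is illustrative.

```python
import math


def critical_ratio(mean_a: float, sd_a: float, n_a: int,
                   mean_b: float, sd_b: float, n_b: int) -> float:
    """Difference in means over its standard error (independent groups assumed)."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Gains scores: unsigned group (n = 204) versus signed group (n = 202).
cr_gains = critical_ratio(30.8, 28.4, 204, 23.7, 23.7, 202)

# Original scores: signed group (n = 202) versus unsigned group (n = 204).
cr_original = critical_ratio(46, 28, 202, 41, 29, 204)

# Faked scores: signed group versus unsigned group.
cr_faked = critical_ratio(71, 29, 202, 70, 28, 204)

print(f"gains: {cr_gains:.2f}  original: {cr_original:.2f}  faked: {cr_faked:.2f}")
```

Computed from the rounded means and SDs, the ratios come to about 2.74, 1.77, and .35, close to the reported 2.73, 1.79, and .36. The small discrepancies are presumably due to rounding in the published figures.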


Discussion 


When instructed to fake the MTAI, many 
of the students in the above samples made 
very large gains over the scores they had 
achieved under standard directions. A few 
made very large losses. (Perhaps the latter
assigned a different meaning to the term
“progressive” than did the former.) The 
majority improved their scores, as indicated 
by the fact that under instructions to fake, 
the various groups raised their mean scores 
approximately one standard deviation. (The 
difference is significant at the .001 level of 
confidence.)

Thus it would appear that the answer to 
the question, “Can prospective teachers fake 
the MTAI?” is a qualified “Yes.” Some of 
the students in the above samples faked more 
successfully than others and some faked in 
the wrong direction, but most were able to 
better their scores significantly. 

To the question, “Does signing the answer 
sheets make a difference?” the answer is
also a qualified “Yes.” As a group, the non- 
signers made lower scores under standard 
directions than did the signers, and conse- 
quently larger gains scores. This might in- 
dicate that some of the nonsigners were more 
frank, less concerned about giving a socially 
acceptable response, than were the signers. 

Perhaps most significant in the framework 
of this study is the range of the gains scores. 
As noted above, when students took the in- 
ventory a second time under changed direc- 
tions, the general tendency was to improve 
one’s score, but one student lowered his score 
by 59 points, while another increased his by 
147, a striking illustration of the common- 
place knowledge that such factors as a stu- 
dent’s beliefs regarding the use to which the 


Table 1 


A Comparison of Original and Faked Scores 





Mean 
Original 
Score 


SD 
Original 


Group Score 


Mean 
Faked 


Differ- 
ence in 
Means 


SD 
Faked 
Score 


Level of 
Signifi- 
cance 


Critical 


Score Ratio 





Prospective 
elementary 
teachers 

Fall 
Spring 


Prospective 
secondary 
teachers 

Fall 
Spring 








194 


scores will be put and his understanding of 
directions may influence his response to an 
inventory. 

This study does not answer the important 
question, “Will prospective teachers fake the
MTAI?” However, in view of the above data 
it would appear that one who proposes to use 
the MTAI in a selection program would do 
well to carefully consider the problem of how 
to get the cooperation of his subjects. At 
least in some cases, if the respondent sees rea- 
son to cooperate, e.g., believes that his scores 
are to be used in counseling, he may answer 
the inventory quite differently than he would 
were he to believe that his responses would 
affect his chances of obtaining a job which he 
wanted, or of gaining admission to a training 
program which he desired to enter. 


Summary 


This study was designed to investigate the 
question of whether prospective teachers can 
deliberately “fake” the MTAI, and to learn 
whether the fact that a student signs (or does 
not sign) his name is likely to influence his 
score. 

Four hundred six prospective elementary
and secondary teachers completed the MTAI,
first under standard directions, and then under
directions to “fake” the inventory. Half
of the students signed their names to the answer
sheets, the other half did not. When
the original and “faked” scores of each stu- 
dent were compared it was found that some 
students had improved their original scores 
greatly, while others had changed their scores 
but in the wrong direction. The group means 
had increased, the difference being significant 
at the .001 level of confidence. 

It appears that signing the answer sheets 
does have an effect, at least in the case of 
some students. The mean original score of
the nonsigners was lower than that of the
signers, while the mean “faked” scores were
approximately the same. Thus, the “gains”
score for the nonsigners was somewhat higher,
the difference being significant at the .01 level
of confidence.


Received June 28, 1955. 


References 


1. Cook, W. W., Leeds, C. H., & Callis, R. The
   Minnesota Teacher Attitude Inventory. New
   York: Psychological Corporation, 1951.

2. Leeds, C. H. A scale for measuring teacher-pupil
   attitudes and teacher-pupil rapport. Psychol.
   Monogr., 1950, 64, No. 6 (Whole No. 312).

3. Rabinowitz, W. The fakeability of the Minnesota
   Teacher Attitude Inventory. Educ. psychol.
   Measmt, 1954, 14, 657-664.

4. Shaw, J., Klausmeier, H. J., Lukes, A. H., & Reid,
   H. T. Changes occurring in teacher-pupil
   attitudes during a two-weeks guidance workshop.
   J. appl. Psychol., 1952, 36, 304-306.





The Journal of Applied Psychology 
Vol. 40, No. 3, 1956 


A Note on Measuring “Understandability” 


Robert F. Lockman ¹


Bureau of Naval Personnel 


Although numerous articles on readability 
measurement have been published, the measurement
of prose intelligibility or “understandability”
has received little attention.
Flesch (1) has pointed out that readability
measures will not indicate whether the ideas
expressed are nonsense, or ungrammatical, it
might be added. Consequently, a reliable
measure of understandability would be an important
supplement to readability, especially
where readability estimates lack relevance for 
a particular group. Theoretically, wherever 
assessed understandability is low, regardless 
of measured readability level, revision to im- 
prove comprehension of the material in ques- 
tion is indicated. 

To this end, the author devised an experi- 
mental rating scale with seven categories com- 
parable to the standard style descriptions 
used by Flesch (1): very easy, easy, fairly 
easy, standard, fairly difficult, difficult, and 
very difficult. The rating form instructs the 
subject to check one of these descriptions 
with respect to his judgment of the under- 
standability of the material being analyzed. 
For example: “In regard to the material you 
have just read, how hard to understand did 
you think it was? Check one of the follow- 
ing to show your over-all judgment.” The 
instructions and style descriptions can be pre- 
sented orally if necessary. 

Data so obtained are suitable for analysis 
with modal ratings, their corresponding aca- 
demic grade levels, and percentages of ratings 
in each category. Comparisons then can be 
made with reading ease (RE) scores trans- 
lated into grade levels and style descriptions. 
Correlations between RE scores and understandability
ratings can be meaningful if scale
directions, types of material analyzed, and
composition of the rating group are carefully
considered.

1 The author was attached to the Aviation Psychology
Laboratory, U. S. Naval School of Aviation
Medicine when the data cited in this note were collected.
Opinions or conclusions herein do not necessarily
reflect the views or possess the endorsement
of the Navy Department.

To cite briefly an application of the fore- 
going technique, RE scores for nine sets of 
directions on standard psychological tests 
used with Naval Aviation Cadets were com- 
puted with the simplified Flesch formula (2). 
From 129 to 273 cadets (median of 171) rated 
these same materials, the number varying 
with test administration schedules. All cadets 
entered the Naval Aviation Cadet Training 
Program during February and March, 1954, 
were from 18 to 25 years of age, had two 
years of college or its equivalent, and had 
been selected on the basis of the Navy Flight 
Aptitude Rating battery and a stringent
physical examination. Because cadets are 
highly selected and relatively homogeneous, 
Flesch RE estimates for the materials used 
are not uniformly relevant. Nevertheless, 
legitimate comparisons of RE and under- 
standability data can be made within the 
sample. Directions were analyzed for tests 
of academic aptitude, spatial orientation, atti- 
tudes, temperament, and personality. 

Flesch RE style descriptions ranged from 
“fairly easy” to “difficult,” the mode being 
“standard” (8th and 9th grade level). Only 
the Frenkel-Brunswik F-Scale instructions 
were at the “college” RE level. In contrast, 
modal understandability ratings of “very 
easy” (53 to 74 per cent of the ratings) oc- 
curred for all materials except the Navy 
Spatial Apperception Test instructions. In
this case, 32 per cent rated them “standard” 
and 26 per cent “fairly easy.” Both RE 
scores and understandability ratings had relatively
little spread, and both indicated that
the test directions or instructions could be 
readily comprehended by cadets. However, 
definite discrepancies in agreement on gen- 
eral style descriptive levels and in the spe- 
cial cases noted existed between the two tech- 
niques. 


Although selection factors will act as de- 
pressors, rank-order and product-moment cor- 
relations were computed to give some indica- 
tion of the relationships between RE scores 
and understandability ratings. Mean under- 
standability values for each set of test direc- 
tions were computed by summing over coded 
rating categories (coded 1 through 7 corresponding
with “very easy” through “very difficult”)
times the number of raters choosing
each category, and dividing by the total num- 
ber of raters. These values were ranked from 
low to high. RE scores were ranked from 
high to low, since the higher the score, the 
more readable the material. To correct for 
this scale reversal in the product-moment 
computations, each RE score was subtracted 
from 100 and the difference used as the score. 
The rho was -.65, significant at the .05 level
in Olds’s tables (3). The product-moment
coefficient was -.52, but not significantly
different from zero with an N of 9. Since
these two coefficients are similar in magnitude 
and direction, it would appear that RE scores
and understandability ratings were not measuring
the same thing.
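
The computations described in this paragraph can be sketched as follows. The rating counts, RE scores, and function names are hypothetical and serve only to show the arithmetic. The rank-difference formula is the usual one for rho and is exact only when there are no tied ranks; the source does not say how ties, if any, were treated.

```python
from typing import Dict, List


def mean_rating(counts: Dict[int, int]) -> float:
    """Mean understandability: sum of (category code x raters) over total raters.

    Codes run from 1 ("very easy") through 7 ("very difficult").
    """
    total_raters = sum(counts.values())
    return sum(code * n for code, n in counts.items()) / total_raters


def rank(values: List[float], descending: bool = False) -> List[int]:
    """Rank values 1..n (no ties assumed); descending=True gives rank 1 to the largest."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_rho(ranks_a: List[int], ranks_b: List[int]) -> float:
    """Rank-difference correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# HYPOTHETICAL data for four sets of test directions.
hypothetical_means = [1.5, 1.8, 1.7, 2.4]   # mean understandability values
hypothetical_re = [70.0, 60.0, 50.0, 40.0]  # reading ease (RE) scores

understandability_ranks = rank(hypothetical_means)    # low to high
re_ranks = rank(hypothetical_re, descending=True)     # high to low
rho = spearman_rho(understandability_ranks, re_ranks)

example_mean = mean_rating({1: 2, 2: 1, 7: 1})  # (2 + 2 + 7) / 4
print(f"example mean rating: {example_mean:.2f}, rho: {rho:.2f}")
```

Ranking RE scores from high to low, as the text describes, puts the most readable material first in both rankings, so that agreement between the two measures appears as a positive rho in this sketch.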

Since high Flesch RE scores are not too 
relevant with highly selected groups, reliable 
understandability ratings could supplement 
them with data on the range and average 
level of intelligibility. When low for a particular
group, regardless of the material’s
measured readability level, these indices would 
indicate revision for better comprehension. 
With materials whose RE scores are more 
relevant for the group involved, ratings could 
specify the limits of intelligibility which can- 
not be determined from readability estimates. 


Received June 7, 1955. 


References 


1. Flesch, R. How to test readability. New York:
   Harper, 1951.

2. Lockman, R. F. Readability of NavCad selection
   tests. USN Sch. Aviat. Med. Res. Rep., 1953,
   Rep. No. NM 001 057.16.05.

3. Olds, E. G. Distributions of sums of squares
   of rank differences for small numbers of individuals.
   Ann. math. Statist., 1938, 9, 133-149.





The Journal of Applied Psychology
Vol. 40, No. 3, 1956


GATB in Foreign Countries 


Beatrice J. Dvorak 


Testing Branch, U. S. Employment Service, U. S. Department of Labor 


The USES General Aptitude Test Battery 
has been translated into a number of foreign 
languages and research is being conducted in 
these foreign countries to adapt and stand- 
ardize it for use on populations in those coun- 
tries. About a year ago, this Journal pub- 
lished a list of organizations and individuals 
that had been granted permission by the 
U. S. Employment Service to use the GATB 
in such research. Foreign psychologists have 
expressed considerable interest in that infor- 
mation. Since the list has doubled during the 
past year, an up-to-date list is presented be- 
low. While information is not available re- 
garding the status of all of these projects, it 
is known that the French, Japanese, Portu- 
guese, and Spanish editions have already been 
published. 


Argentina 


Carlos A. Pourteau Agote 
Universidad de Buenos Aires 
Laboratorio Psicotecnico 
Buenos Aires, Argentina 


Australia 
H. A. Bland 
Department of Labour and National Service 
Melbourne, Australia 


Belgium 
R. Buyse 
University of Louvain
Tournai, Belgium 


R. Dessart 
Compagnie Generale des Conduites D’Eau 
Liége, Belgium 


M. Dewals 

Psychotechnicien de la Société Nationale des 
Chemins de Fer Vicinaux 

Bruxelles, Belgium 


J. Gillet 
Ministére de la Defense Nationale 
Bruxelles, Belgium 


Jean Herickx 
Centre d’Orientation 
Bruxelles, Belgium 


P. Houssa 
Brugmann Hospital 
Bruxelles, Belgium 


F. Vandenborre 
Ministére de l’Instruction Publique
Bruxelles, Belgium 


Brazil 
Joal Baptista d’Avilla 
Servico Nacional De Aprendizagem Industrial 
Sao Paulo, Brazil 


Jacy Magalhaes 
Divisao de Organizacao do Trabalho 
Rio de Janeiro, Brazil 


Livraria Oscar Nicolai 
Belo Horizonte, Brazil 


Eugene Novgorodoff 

Ladeira Tabajaras 140, Apto 904 
Copacabana 

Rio de Janeiro, Brazil 


S. J. Schwarzstein 
Servico de Colocacao e Informacao Profissional 
Sao Paulo, Brazil 


Canada 
G. P. Cosgrave 
Director, Counseling Service 
The Toronto Young Men’s Christian Association 
Toronto, Canada 


J. Fred Dawe 
Civil Service Commission 
Ottawa, Canada 


Thomas Fishbourne 
Canadian National Employment Service 
Ottawa, Canada 


Morgan D. Parmenter
University of Toronto
Toronto, Canada


China 


Ministry of Social Affairs 
Shanghai, China 


Cuba 
Jose M. Gutierrez 
Universidad de la Habana 
Habana, Cuba 




Denmark 
Poul Bahnsen 
Director, Psykotekniske Institut 
Copenhagen, Denmark 


Paul Vidriksen 
Arbejdsdi Rektoratet 
Copenhagen, Denmark 


Egypt 
S. A. Batraur 
Egyptian Army 
Cairo, Egypt 


S. A. Morsi 
Egyptian Army 
Cairo, Egypt 


El Salvador


Hector Garay Pacheco 
Ministro de Trabajo y Prevision Social 
San Salvador, El Salvador 


Mario Hector Salazar 
Ministro de Trabajo y Prevision Social 
San Salvador, El Salvador 


England 
S. M. Cox 
Westminster Hospital 
London, England 


M. Desai 

Psychological Department 
London County Council 
London, England 


H. J. Eysenck 
The Maudsley Hospital 
London, England 


Edward Fox 

Winwick and Newchurch Hospital Management 
Committee 

Warrington, England 


C. B. Frisby 

Director, National Institute of Industrial Psy- 
chology 

London, England 


Roland Harper 

The University of Leeds 
Leeds, England 

D. R. Martin 


The University of Leeds 
Leeds, England 


Constance M. Mathieson 
East Anglian Regional Hospital Board 
Norwich, England 







C. E. Mitchell 

St. Francis Hospital 
Hayward Heath, England 
G. Naylor 

Rainhill Hospital 
Rainhill, England 

C. J. Price 
Westminster Hospital 
London, England 

B. W. Richards 

St. Laurence’s Hospital 
Caterham, England 


Alec Rodger 
Birkbeck College 
University of London 
London, England 


J. Tizard 
The Maudsley Hospital 
London, England 





India 


N. R. Chattopadhyay 
Department of Psychology 
University of Calcutta 
Calcutta, India 


D. K. Dator 
Ministry of Labour 
Bombay, India 


Bhim S. Narula 
Ministry of Labor 
New Delhi, India 


Vocational Guidance Bureau 
Bombay, India 


Israel 
Esther Gottstein 
Mental Health Clinic 
Jerusalem, Israel 


Italy 
Silvano Chiari 
Centro Di Orientamento Scolastico Professionale 
Firenze, Italy 


Gastone Conti 
Instituto Tecnico Industriale 
Udine, Italy 


Vincenzo Flagiello 

Societa per l’Industria e l’Ellettricita
Centro Istruzione Professionale 
Terni, Italy 













Agostino Gemelli 
Director, Laboratorio di Psicologia Sperimentale 
Milano, Italy 


Instituto Nazionale di Psicologia 

Rome, Italy 

Guido Majaron 

Viale Arnaldo Fusinato 2F 

Vicenza, Italy 

Luigi Meschieri 

Department of Labor 

Rome, Italy 

Vasco Pisani 

Consorzio Provinciale Istruzione Tecnica 
Centro Di Orientamento Scolastico Professionale 
Siena, Italy 

Giorgio Tampieri 

Consorzio Provinciale Istruzione Tecnica 
Centro Orientamento Scolastico Professionale 
Trieste, Italy 


Japan 
Gregory Kihachi Fujimoto 
Rikkyo University 
Tokyo, Japan 
T. Kondo 
Employment Security Bureau 
Tokyo, Japan 
Hiroshi Matsumoto 
Ministry of Labor 
Tokyo, Japan 


Malta 


John Patrick Hamilton 
Department of Labour 
Valetta, Malta 


Mexico 
Matias Lopez, Jr. 
Maria Montessori Psychology Laboratory 
Tlaxcala, Mexico 


New Zealand 


Auckland University College 
Auckland, New Zealand 


W. J. H. Clark 


Vocational Guidance Centre 
Auckland, New Zealand 


Pakistan 
Rafi Z. Khan 
Pakistan Public Service Commission 
Karachi, Pakistan 




Peru 
Santiago Salinas 
Ministerio de Trabajo y Asuntos Indigenas 
Lima, Peru 


Philippines 
Apolinario Garcia Apilado 
Northern Luzon School of Arts and Trades 
Vigan, Ilocos Sur, Philippines 


Florencio Mones Apolinar 
Vocational Education Division 
Bureau of Public Schools 
Department of Education 
Manila, Philippines 


Alberto Bernardo Garcia 

Central Luzon School of Arts and Trades 
Cabanatuan City, Philippines 

Wenceslao Gozan 

Department of Labor 

Manila, Philippines 


Rodrigo L. Jarabe 
Philippine Employment Service 
Manila, Philippines 


Antonio V. Roxas 
Escolta, Manila, Philippines 


Guillermo Torres 
Mindanao College 
Davao, Mindanao, Philippines 


Scotland 
P. S. Boyd 
Department of Mental Health 
Aberdeen, Scotland 


W. M. Miller 


Department of Mental Health 
Aberdeen, Scotland 


South Africa 


Department of Psychology 
University of Stellenbosch 
Stellenbosch, South Africa 


C. P. J. Erasmus 

University of the Orange Free State 
Bloemfontein, South Africa 

Evryl Fisher

Church Street 

Cape Town, South Africa 

D. J. Du Plessis 

Department of Labor 
Johannesburg, South Africa 







J. J. Scheepers 
Department of Labor 
Johannesburg, South Africa 


Sweden 
Torsten Husen 
Cintrala Varnpliktsbyran 
Personalprovingsdetaljen 
Stockholm, Sweden 


Switzerland 
J. F. Herzog 
Office d’Orientation Professionelle 
Neuchatel, Switzerland 


Ph. H. Muller 
Université de Neuchatel 
Neuchatel, Switzerland 




Turkey 
Faruk Kardam 
Turkish Employment Service 
Ankara, Turkey 


Venezuela 
John R. Boulger 
Socony-Vacuum Oil Company of Venezuela 
Apartado No. 246 
Caracas, Venezuela 


Various Foreign Countries 


Edwin R. Henry 
Employment Relations Department 
Standard Oil Company 


Received July 11, 1955. 

























